CN107894972A

CN107894972A - Session tokens method, apparatus, aggregate server and storage medium

Info

Publication number: CN107894972A
Application number: CN201711130201.3A
Authority: CN
Inventors: 廖大春
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2017-11-15
Filing date: 2017-11-15
Publication date: 2018-04-10

Abstract

The present invention discloses a session tokens method, an apparatus, an aggregate server and a storage medium. The method includes: obtaining, according to a predetermined session identification, first session information and second session information corresponding to a session to be marked, wherein the first session information includes input voice information and the text message formed by performing speech recognition on the input voice information, and the second session information includes the text message formed by performing speech recognition on the input voice information and an output text message; converging the first session information and the second session information according to the session identification; receiving a session tokens instruction fed back by a user, wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction; and marking the converged first session information and second session information respectively according to the first session tokens instruction and the second session tokens instruction. The invention can reduce the manpower required for back-end data marking in a voice conversation system and improve data-processing efficiency.

Description

Session tokens method, apparatus, aggregate server and storage medium

Technical field

The present invention relates to the technical field of Internet applications, and more particularly to a session tokens method, an apparatus, an aggregate server and a storage medium.

Background art

With the rapid development of artificial intelligence, intelligent robots based on voice conversation systems are being applied in many fields. Through natural-language dialogue they can provide functions such as audio-visual entertainment, information inquiry, daily-life services, and travel and traffic conditions.

At present, limited by existing speech and semantic technology, a machine's ability to recognize speech and understand its meaning still needs continual improvement. Regularly collecting the interaction data between users and the machine, and verifying and marking that data, is therefore a prerequisite for improving a voice conversation system. In the prior art, a user cannot feed back an erroneous interaction result after using the voice conversation system. The back-end personnel of the system therefore have to verify users' interaction data manually at regular intervals, repeatedly checking and screening the users' input voice one by one and marking the erroneous interaction data for retraining and improving the voice conversation system.

However, because voice conversation systems are widely applied and have many users, the amount of user interaction data is enormous. The prior-art approach, in which back-end personnel repeatedly screen and mark massive amounts of data by hand, therefore incurs high labor cost and low processing efficiency.

Summary of the invention

Embodiments of the present invention provide a session tokens method, an apparatus, an aggregate server and a storage medium that can mark interaction data according to the user's feedback result, reducing the manpower required for back-end data marking in a voice conversation system and improving data-processing efficiency.

In a first aspect, an embodiment of the present invention provides a session tokens method applied to an aggregate server, the method including:

obtaining, according to a predetermined session identification, first session information and second session information corresponding to a session to be marked; wherein the first session information includes input voice information and the text message formed by performing speech recognition on the input voice information, and the second session information includes the text message formed by performing speech recognition on the input voice information and an output text message;

converging the first session information and the second session information according to the session identification;

receiving a session tokens instruction fed back by a user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction;

marking the converged first session information and second session information respectively according to the first session tokens instruction and the second session tokens instruction.

In a second aspect, an embodiment of the present invention provides a session tokens apparatus, the apparatus including an acquisition module, a convergence module, a receiving module and a marking module; wherein,

the acquisition module is configured to obtain, according to a predetermined session identification, first session information and second session information corresponding to a session to be marked; wherein the first session information includes input voice information and the text message formed by performing speech recognition on the input voice information, and the second session information includes the text message formed by performing speech recognition on the input voice information and an output text message;

the convergence module is configured to converge the first session information and the second session information according to the session identification;

the receiving module is configured to receive a session tokens instruction fed back by a user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction;

the marking module is configured to mark the converged first session information and second session information respectively according to the first session tokens instruction and the second session tokens instruction.

In a third aspect, an embodiment of the present invention provides an aggregate server, including:

One or more processors；

Memory, for storing one or more programs；

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the session tokens method described in any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the session tokens method described in any embodiment of the present invention.

The embodiments of the present invention propose a session tokens method, an apparatus, an aggregate server and a storage medium, in which the aggregate server can receive a session tokens instruction fed back by the user and then mark the first session information and the second session information according to that instruction. Existing session tokens methods, by contrast, mark the session information in the voice server and the session information in the semantic server manually, one by one. Compared with the prior art, therefore, the session tokens method, apparatus, aggregate server and storage medium proposed by the embodiments of the present invention can mark session information according to the user's feedback result, which reduces the manpower required and improves marking efficiency. Moreover, the technical solution of the embodiments of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.

Brief description of the drawings

Fig. 1 is a flow chart of a session tokens method provided by Embodiment One of the present invention;

Fig. 2 is a flow chart of a session tokens method provided by Embodiment Two of the present invention;

Fig. 3 is a structural schematic diagram of a session tokens apparatus provided by Embodiment Three of the present invention;

Fig. 4 is a structural schematic diagram of an aggregate server provided by Embodiment Four of the present invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.

Embodiment one

Fig. 1 is a flow chart of a session tokens method provided by Embodiment One of the present invention. This embodiment is applicable to collecting human-machine interaction data in a voice conversation system. The method can be performed by a session tokens apparatus, which can be implemented in software and/or hardware. With reference to Fig. 1, the method specifically includes the following steps:

S110, obtaining, according to a predetermined session identification, first session information and second session information corresponding to a session to be marked; wherein the first session information includes input voice information and the text message formed by performing speech recognition on the input voice information, and the second session information includes the text message formed by performing speech recognition on the input voice information and an output text message.

In a specific embodiment of the present invention, the session identification is information that the voice server generates when it receives the user's original conversational speech, i.e. the input voice information; it corresponds uniquely to the input voice information and serves to identify it. The first session information is the session information received and processed by the voice server. It includes the input voice information and the text message formed after the voice server performs speech recognition on the input voice information. The text message formed after speech recognition is assigned the same unique session identification as the corresponding input voice information, so that the session identification associates the input voice information with the text message formed after speech recognition.

The second session information is the session information received and processed by the semantic server. It includes the output text message and the text message formed after speech recognition by the voice server. The output text message is the text message that the semantic server forms by performing semantic parsing and result satisfaction on the text message formed after speech recognition. Here, result satisfaction means that, after receiving the text message formed after speech recognition, the semantic server can query a local resource store for a query result corresponding to the content of that text message, or can perform a web search based on that content to obtain a corresponding search result in real time, and then forms the output text message from the query result or search result that satisfies the text message formed after speech recognition. The output text message is assigned the same unique session identification as the corresponding text message formed after speech recognition, so that the session identification associates the output text message with the text message formed after speech recognition.
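The result-satisfaction step described above can be sketched as follows. This is a minimal illustration only: it assumes the local resource store is a plain dictionary and that web search is supplied as a callable, and the names `fulfil_query`, `LOCAL_STORE` and `fake_web_search` are hypothetical and do not come from the embodiment. Trying the local store first and falling back to web search follows the weather example and is otherwise an assumption.

```python
from typing import Callable, Dict, Optional, Tuple


def fulfil_query(
    session_id: str,
    recognized_text: str,
    local_store: Dict[str, str],
    web_search: Callable[[str], Optional[str]],
) -> Tuple[str, Optional[str]]:
    """Return (session_id, output_text) for one recognized text message.

    The local resource store is consulted first; web search is used only
    when the store has no entry. The output text keeps the session
    identification of the recognized text it answers.
    """
    output_text = local_store.get(recognized_text)
    if output_text is None:
        output_text = web_search(recognized_text)
    return session_id, output_text


# Hypothetical stand-ins for the local resource store and the web search.
LOCAL_STORE = {"What is your name": "I am a voice assistant"}


def fake_web_search(text: str) -> Optional[str]:
    results = {
        "How is the weather in Beijing": "Sunny, 10 °C, northwest wind force 3 to 4",
    }
    return results.get(text)


print(fulfil_query("001", "How is the weather in Beijing", LOCAL_STORE, fake_web_search))
```

Passing the search function in as a parameter keeps the sketch independent of any particular search backend.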

During session interaction, the aggregate server actively obtains, in real time and according to the session identification corresponding to the voice to be marked, the first session information and the second session information of the current session from the voice server and the semantic server respectively, for later information collation and marking.

For example, in the interactive operation of a robot integrated with the voice conversation system of the embodiment of the present invention, the user can speak the content to be queried or searched into the microphone attached to the robot, for instance the input voice "How is the weather in Beijing". The voice server receives the input voice information and first generates a session identification that corresponds uniquely to the input voice information and serves to identify it, for example 001. It then performs speech recognition on the input voice information to form the text message "How is the weather in Beijing", and assigns this text message the same unique session identification as the corresponding input voice information, i.e. 001. The semantic server receives the text message formed after speech recognition together with the session identification, and performs semantic understanding on the text message. Having correctly recognized that the user wants the current weather in Beijing, the semantic server cannot obtain the real-time weather by querying the local resource store, so it performs a web search and obtains the corresponding search result. For example, the web search finds that the real-time weather in Beijing is sunny, 10 °C, northwest wind force 3 to 4, and the semantic server forms the output text message "Sunny, 10 °C, northwest wind force 3 to 4", which satisfies the text message formed after speech recognition. This output text message is assigned the same session identification, 001, as the corresponding text message formed after speech recognition. At this point, all interaction data in the voice conversation carry the same session identification, 001. Therefore, according to the session identification 001 corresponding to the voice to be marked, the aggregate server actively obtains in real time, from the voice server and the semantic server respectively, the interaction data whose session identification is 001, namely the input voice information "How is the weather in Beijing", the text message "How is the weather in Beijing" formed after speech recognition, and the output text message "Sunny, 10 °C, northwest wind force 3 to 4", for use in later information collation and marking.
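The way one session identification travels through both servers in this example can be sketched as below. The record types, the function names and the three-digit identification format are assumptions made for illustration; speech recognition and semantic understanding are replaced by placeholder callables.

```python
import itertools
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FirstSessionInfo:
    """Hypothetical record held by the voice server."""
    session_id: str
    input_voice: bytes
    recognized_text: str


@dataclass(frozen=True)
class SecondSessionInfo:
    """Hypothetical record held by the semantic server."""
    session_id: str
    recognized_text: str
    output_text: str


_next_number = itertools.count(1)


def voice_server_handle(
    input_voice: bytes, recognize: Callable[[bytes], str]
) -> FirstSessionInfo:
    # The identification is generated once, when the input voice arrives,
    # and is shared by the recognized text.
    session_id = f"{next(_next_number):03d}"
    return FirstSessionInfo(session_id, input_voice, recognize(input_voice))


def semantic_server_handle(
    session_id: str, recognized_text: str, answer: Callable[[str], str]
) -> SecondSessionInfo:
    # The output text inherits the identification of the recognized text.
    return SecondSessionInfo(session_id, recognized_text, answer(recognized_text))


first = voice_server_handle(
    b"<audio bytes>", lambda _voice: "How is the weather in Beijing"
)
second = semantic_server_handle(
    first.session_id,
    first.recognized_text,
    lambda _text: "Sunny, 10 °C, northwest wind force 3 to 4",
)
print(first.session_id == second.session_id)
```

Because both records carry the same `session_id`, a third party such as the aggregate server can later fetch them independently and still match them.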

S120, converging the first session information and the second session information according to the session identification.

In a specific embodiment of the present invention, after actively obtaining the first session information and the second session information of the current session from the voice server and the semantic server respectively, the aggregate server obtains from the voice server the session identification of the session currently to be marked. According to the session identification carried by each piece of information, it converges the first session information and the second session information whose session identification matches that of the session to be marked. That is, it builds the corresponding session identification, input voice information, text message formed after speech recognition and output text message into a four-tuple: (session identification, input voice information, text message formed after speech recognition, output text message). The user's interaction data within one conversation are thereby associated with one another for use in later marking and display.

For example, continuing the example above, the input voice information "How is the weather in Beijing", the text message "How is the weather in Beijing" formed after speech recognition, and the output text message "Sunny, 10 °C, northwest wind force 3 to 4" each carry the session identification 001, so they are converged into the four-tuple (001, voice "How is the weather in Beijing", text "How is the weather in Beijing", text "Sunny, 10 °C, northwest wind force 3 to 4").
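Converging by session identification amounts to a join on that key. The sketch below assumes each server exposes its records as a dictionary keyed by session identification and that the input voice is represented by a file reference; the function name `converge` and the sample data are illustrative only.

```python
from typing import Dict, Tuple

# (session identification, input voice, recognized text, output text)
Quad = Tuple[str, str, str, str]


def converge(
    first_infos: Dict[str, Tuple[str, str]],
    second_infos: Dict[str, Tuple[str, str]],
) -> Dict[str, Quad]:
    """Join first and second session information on the session identification.

    first_infos maps an identification to (input voice reference, recognized text);
    second_infos maps it to (recognized text, output text). Only identifications
    present on both sides produce a four-tuple.
    """
    quads: Dict[str, Quad] = {}
    for sid in sorted(first_infos.keys() & second_infos.keys()):
        voice_ref, recognized_text = first_infos[sid]
        _, output_text = second_infos[sid]
        quads[sid] = (sid, voice_ref, recognized_text, output_text)
    return quads


voice_side = {
    "001": ("001.wav", "How is the weather in Beijing"),
    "003": ("003.wav", "Play some music"),  # no semantic result yet
}
semantic_side = {
    "001": ("How is the weather in Beijing", "Sunny, 10 °C, northwest wind force 3 to 4"),
}
print(converge(voice_side, semantic_side))
```

Skipping identifications that appear on only one side is a design choice of this sketch; the embodiment does not say how incomplete sessions are handled.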

Converging the interaction data of one conversation not only makes the interaction data easier to organize, but also makes it easier, during later data marking and screening, to query the data and locate errors, thereby improving data-processing efficiency.

S130, receiving a session tokens instruction fed back by the user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction.

In a specific embodiment of the present invention, the output text message produced by the semantic server is passed back to the voice server. The voice server converts the output text message into corresponding speech by speech synthesis and plays it to the user through a device such as the loudspeaker attached to the robot, so that the user obtains a response voice corresponding to his or her own input voice. The user can then feed back to the robot, based on the response voice obtained, whether he or she is satisfied with it; that is, the aggregate server receives the session tokens instruction fed back by the user.

For example, in the example above, the voice server converts the output text message "Sunny, 10 °C, northwest wind force 3 to 4" into the corresponding response voice by speech synthesis. After the robot plays it, the user hears the response voice and learns the real-time weather in Beijing, so the user is satisfied with the robot's response voice. The user may perform no operation, or may perform an operation indicating that the result meets the requirement, whereupon the aggregate server obtains the corresponding mark instruction fed back by the user.

The session tokens instruction includes a first session tokens instruction or a second session tokens instruction. The first session tokens instruction is the user's feedback on the first session information, and the second session tokens instruction is the user's feedback on the second session information. A system error occurs when the voice server recognizes the input voice information as an erroneous text message, and/or the semantic server understands or converts the input text message into an erroneous output text message, so that the user obtains an erroneous response voice. In that case, when the user gives feedback through equipment without a display screen, such as a microphone, the user can only give feedback, based on the erroneous response voice, on the four-tuple of this conversation as a whole, and the aggregate server receives a first session tokens instruction and a second session tokens instruction that are consistent with each other. When the user gives feedback through equipment with a display screen, such as a mobile terminal like a mobile phone or tablet computer, the user can give feedback separately on the data of the different processing stages according to the four-tuple shown on the display screen, and the aggregate server receives the first session tokens instruction and/or the second session tokens instruction fed back by the user.

For example, suppose the input voice information is "How is the weather in Beijing", the text message formed after speech recognition is "How is the weather in Beijing", the output text message is "Your current location is Beijing Quanjude restaurant (Qianmen branch)", and the session identification of this session is 002. After hearing the robot's response voice about Quanjude, the user can press the button on the microphone to feed back a session tokens instruction indicating that this session is erroneous. Alternatively, where the mobile phone provides a first button for the first session information and a second button for the second session information, the user can see from the four-tuple shown on the display screen, i.e. (002, voice "How is the weather in Beijing", text "How is the weather in Beijing", text "Your current location is Beijing Quanjude restaurant (Qianmen branch)"), that the second session information is wrong, and can press the second button on the mobile phone to feed back a second session tokens instruction indicating that the second session information of this session is erroneous. Similarly, when the first session information in the four-tuple is wrong, the user can press the first button on the mobile phone to feed back a first session tokens instruction indicating that the first session information of this session is erroneous. The session tokens instructions fed back by the user as described above are all received by the aggregate server.
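The difference between the two kinds of feedback equipment can be sketched as a small mapping from a button press to the instructions the aggregate server receives. The enum, the button names "first" and "second", and the function `instructions_from_feedback` are hypothetical; the sketch covers only negative feedback, as in the Quanjude example.

```python
from enum import Enum
from typing import Iterable, Set


class Instruction(Enum):
    FIRST = "first session tokens instruction"
    SECOND = "second session tokens instruction"


# Assumed button names on equipment with a display screen.
_BUTTONS = {"first": Instruction.FIRST, "second": Instruction.SECOND}


def instructions_from_feedback(
    has_display: bool, pressed: Iterable[str] = ()
) -> Set[Instruction]:
    """Translate an error report into session tokens instructions.

    Without a display the user cannot tell which stage failed, so a single
    button press yields both instructions. With a display the user presses
    the button for each stage that looks wrong.
    """
    if not has_display:
        return {Instruction.FIRST, Instruction.SECOND}
    return {_BUTTONS[name] for name in pressed}


# Microphone button: the whole session 002 is reported as erroneous.
print(instructions_from_feedback(has_display=False))
# Mobile phone: only the output text of session 002 is reported as erroneous.
print(instructions_from_feedback(has_display=True, pressed=["second"]))
```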

S140, marking the converged first session information and second session information respectively according to the first session tokens instruction and the second session tokens instruction.

In a specific embodiment of the present invention, after receiving the session tokens instruction fed back by the user, the aggregate server marks the session information to which the session tokens instruction corresponds. For example, in the example above, the user feeds back a second session tokens instruction according to the four-tuple shown on the screen, (002, voice "How is the weather in Beijing", text "How is the weather in Beijing", text "Your current location is Beijing Quanjude restaurant (Qianmen branch)"). According to the second session tokens instruction, the aggregate server then marks the semantic-understanding text "Your current location is Beijing Quanjude restaurant (Qianmen branch)" in the interaction data corresponding to session identification 002, for later use in improving the voice conversation system.
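A minimal sketch of this marking step is given below. It assumes the aggregate server keeps the converged four-tuples in a dictionary keyed by session identification, with one mark slot per kind of session information; the layout of `store` and the function names are illustrative and not part of the embodiment.

```python
from typing import Dict, Iterable, List


def mark_errors(store: Dict[str, dict], session_id: str, parts: Iterable[str]) -> None:
    """Mark the named parts ("first" and/or "second") of one session as erroneous."""
    record = store[session_id]
    for part in parts:
        if part not in record["marks"]:
            raise ValueError(f"unknown session information: {part!r}")
        record["marks"][part] = "error"


def sessions_needing_review(store: Dict[str, dict]) -> List[str]:
    """List the identifications of sessions with at least one error mark."""
    return [sid for sid, rec in store.items() if "error" in rec["marks"].values()]


weather = "How is the weather in Beijing"
store = {
    "001": {
        "quad": ("001", "001.wav", weather, "Sunny, 10 °C, northwest wind force 3 to 4"),
        "marks": {"first": None, "second": None},
    },
    "002": {
        "quad": ("002", "002.wav", weather,
                 "Your current location is Beijing Quanjude restaurant (Qianmen branch)"),
        "marks": {"first": None, "second": None},
    },
}

# The user pressed the second button for session 002.
mark_errors(store, "002", ["second"])
print(sessions_needing_review(store))
```

Only the marked sessions then need to be passed on for retraining, which is where the reduction in manual screening comes from.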

In the technical solution of this embodiment, the aggregate server converges the first session information and the second session information of a voice conversation according to the session identification of the voice to be marked, receives the session tokens instruction fed back by the user, and marks the first session information and the second session information according to that instruction. Therefore, compared with the prior art, session information can be marked according to the user's feedback result, which reduces the manpower required and improves marking efficiency. Moreover, the technical solution of the embodiment of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.

Embodiment two

On the basis of Embodiment One above, this embodiment provides a preferred implementation of the session tokens method, in which interaction data can be converged according to a unique session identification and the interaction text messages of different processing stages can be marked. Fig. 2 is a flow chart of a session tokens method provided by Embodiment Two of the present invention. As shown in Fig. 2, the method includes the following specific steps:

S210, obtaining from the voice server, according to the session identification, the first session information corresponding to the session to be marked.

In a specific embodiment of the present invention, the session identification is information that the voice server generates when it receives the user's original conversational speech, i.e. the input voice information; it corresponds uniquely to the input voice information and serves to identify it. The first session information is the session information received and processed by the voice server. It includes the input voice information and the text message formed after speech recognition, which is the text message the voice server forms by performing speech recognition on the input voice information. The text message formed after speech recognition is assigned the same unique session identification as the corresponding input voice information, so that the session identification associates the input voice information with the text message formed after speech recognition. The first session information corresponding to the voice to be marked can therefore be obtained according to the session identification corresponding to that voice.

S220, obtaining from the semantic server, according to the session identification, the second session information corresponding to the session to be marked; wherein the output text message is the text message that the semantic server forms by performing semantic parsing and result satisfaction on the text message formed after speech recognition.

In a specific embodiment of the present invention, the second session information is the session information received and processed by the semantic server. It includes the output text message and the text message formed after speech recognition by the voice server. The output text message is the text message that the semantic server forms by performing semantic parsing and result satisfaction on the text message formed after speech recognition. Here, result satisfaction means that, after receiving the text message formed after speech recognition, the semantic server can query a local resource store for a query result corresponding to the content of that text message, or can perform a web search based on that content to obtain a corresponding search result in real time, and then forms the output text message from the query result or search result that satisfies the text message formed after speech recognition. The output text message is assigned the same unique session identification as the corresponding text message formed after speech recognition, so that the session identification associates the output text message with the text message formed after speech recognition. The second session information corresponding to the voice to be marked can therefore be obtained according to the session identification corresponding to that voice.

S230, converging the session identification, the input voice information, the text message formed after speech recognition and the output text message into a four-tuple corresponding to the session to be marked.

In a specific embodiment of the present invention, the aggregate server obtains, according to the session identification carried by each piece of information, the first session information and the second session information corresponding to the voice to be marked, i.e. the corresponding session identification, input voice information, text message formed after speech recognition and output text message. It builds these four pieces of information corresponding to the voice to be marked into a four-tuple: (session identification, input voice information, text message formed after speech recognition, output text message). All of the user's interaction data within one conversation are thereby associated with one another for use in later marking and display.

S240, receiving a session tokens instruction fed back by the user through a first feedback device or a second feedback device; wherein the first feedback device and the second feedback device are two different feedback devices.

In a specific embodiment of the present invention, the output text message produced by the semantic server is passed back to the voice server. The voice server converts the output text message into corresponding speech by speech synthesis and plays it to the user through a device such as the loudspeaker attached to the robot, so that the user obtains a response voice corresponding to his or her own input voice. The user can then feed back to the robot, based on the response voice obtained, whether he or she is satisfied with it; that is, the aggregate server receives the session tokens instruction fed back by the user. The session tokens instruction includes a first session tokens instruction or a second session tokens instruction. The first session tokens instruction is the user's feedback on the first session information, and the second session tokens instruction is the user's feedback on the second session information.

The first feedback device is equipment that has operation buttons but no touch display screen, such as a microphone; the user can feed back the result of the voice conversation by pressing a button on the first feedback device. The second feedback device is equipment that has a touch display screen, such as a mobile terminal like a mobile phone or tablet computer; the user can, according to the four-tuple of interaction data shown on the display screen, touch different buttons on the touch display screen to give feedback separately on the interaction data of the different processing stages.

For example, a system error occurs when the voice server recognizes the input voice information as an erroneous text message, and/or the semantic server understands or converts the text message formed after speech recognition into an erroneous output text message, so that the user obtains an erroneous response voice. In that case, when the user gives feedback through the first feedback device, the user can, based on the erroneous response voice, give feedback on the four-tuple of this conversation as a whole by pressing the button on the first feedback device, and the aggregate server receives a first session tokens instruction and a second session tokens instruction that are consistent with each other. When the user uses the second feedback device, learns from the erroneous response voice that this interaction result is wrong, and wants to give feedback to the voice conversation system, the user can further judge from the four-tuple shown on the display screen at which stage of the interaction the error occurred. The user then gives feedback separately on the interaction data of the different processing stages by touching different buttons on the touch display screen, and the aggregate server receives the first session tokens instruction and/or the second session tokens instruction fed back by the user.

S250, marking the converged first session information and second session information respectively according to the first session tokens instruction and the second session tokens instruction.

Preferably, when the first session tokens instruction and the second session tokens instruction are both mark instructions indicating user satisfaction, the converged first session information and second session information are each marked as session information with which the user is satisfied. When the first session tokens instruction and the second session tokens instruction are both mark instructions indicating user dissatisfaction, the converged first session information and second session information are each marked as session information with which the user is dissatisfied. When the first session tokens instruction indicates user satisfaction and the second session tokens instruction indicates user dissatisfaction, the converged first session information is marked as session information with which the user is satisfied and the converged second session information is marked as session information with which the user is dissatisfied. When the first session tokens instruction indicates user dissatisfaction and the second session tokens instruction indicates user satisfaction, the converged first session information is marked as session information with which the user is dissatisfied and the converged second session information is marked as session information with which the user is satisfied.
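The four cases listed above reduce to labelling each kind of session information independently from its own instruction. The following sketch makes that explicit; the label strings and the function name `label_converged` are assumptions chosen for illustration.

```python
from typing import Tuple

SATISFIED = "satisfied"
DISSATISFIED = "dissatisfied"


def label_converged(first_satisfied: bool, second_satisfied: bool) -> Tuple[str, str]:
    """Return the marks for (first session information, second session information).

    Each argument states whether the corresponding session tokens instruction
    indicates user satisfaction.
    """
    first_mark = SATISFIED if first_satisfied else DISSATISFIED
    second_mark = SATISFIED if second_satisfied else DISSATISFIED
    return first_mark, second_mark


# Enumerate the four combinations described in the text.
for first_ok in (True, False):
    for second_ok in (True, False):
        print(first_ok, second_ok, label_converged(first_ok, second_ok))
```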

In a specific embodiment of the present invention, a user-satisfied marking instruction indicates that the response voice obtained for the voice to be marked through the voice conversation system is correct, and that the user obtained a result meeting his or her needs. Conversely, a user-unsatisfied marking instruction indicates that the response voice obtained for the voice to be marked through the voice conversation system is wrong, and that the user obtained a result that does not meet those needs. The different pieces of session information are therefore marked according to the session tokens instructions fed back by the user.
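The four cases of the marking step can be sketched as follows, purely by way of example. The function `mark_session`, the label constants and the dictionary representation of the session information are assumptions made for illustration, not part of the original disclosure. Because each instruction is applied independently to its own session information, all four combinations follow from the same two assignments.

```python
from typing import Dict, Tuple

# Hypothetical label values; the original text only names the two categories.
SATISFIED = "user_satisfied"
UNSATISFIED = "user_unsatisfied"


def mark_session(
    first_info: Dict[str, object],
    second_info: Dict[str, object],
    first_instruction: str,
    second_instruction: str,
) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Return marked copies of the converged first and second session information."""
    for instruction in (first_instruction, second_instruction):
        if instruction not in (SATISFIED, UNSATISFIED):
            raise ValueError(f"unknown marking instruction: {instruction!r}")
    # The first instruction marks the first session information,
    # the second instruction marks the second session information.
    marked_first = dict(first_info, label=first_instruction)
    marked_second = dict(second_info, label=second_instruction)
    return marked_first, marked_second
```

The inputs are copied rather than modified in place, so the unmarked converged records remain available if the user later revises the feedback.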

In the technical solution of this embodiment, the aggregate server uses the session identification of the session to be marked to converge the first session information and the second session information produced during the voice conversation, can receive the user's session tokens instructions from different types of feedback devices, and marks the first session information and the second session information according to the session tokens instructions fed back by the user. Compared with the prior art, session information can therefore be marked according to the user's feedback, which reduces the manpower required and improves marking efficiency. In addition, the technical solution of this embodiment of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.

Embodiment three

Fig. 3 is a schematic structural diagram of a session tokens device provided by Embodiment three of the present invention. This embodiment is applicable to the collection of human-computer interaction data in voice conversation, and the device can implement the session tokens method described in any embodiment of the present invention. The device specifically includes:

An acquisition module 310, configured to obtain, according to a predetermined session identification, the first session information and the second session information corresponding to the session to be marked; wherein the first session information includes the input voice information and the text information formed after speech recognition is performed on the input voice information, and the second session information includes the text information formed after speech recognition is performed on the input voice information and the output text information;

A convergence module 320, configured to converge the first session information and the second session information according to the session identification;

A receiving module 330, configured to receive the session tokens instruction fed back by the user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction;

A marking module 340, configured to mark the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.

Further, the acquisition module 310 includes:

A first acquisition unit 3101, configured to obtain, in the voice server and according to the session identification, the first session information corresponding to the session to be marked;

A second acquisition unit 3102, configured to obtain, in the semantic server and according to the session identification, the second session information corresponding to the session to be marked; wherein the output text information is the text information formed after the semantic server performs semantic parsing and result fulfilment on the text information formed after the speech recognition.

Further, the convergence module 320 is specifically configured to converge the session identification, the input voice information, the text information formed after the speech recognition, and the output text information into one four-tuple corresponding to the session to be marked.
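By way of a minimal, hedged sketch, the convergence into a four-tuple could look as follows. The type `SessionFourTuple`, the function `converge` and the two in-memory dictionaries standing in for the records held by the voice server and the semantic server are all hypothetical; how the servers actually store and expose their records is not specified in the original text.

```python
from typing import Dict, NamedTuple, Tuple


class SessionFourTuple(NamedTuple):
    session_id: str
    input_voice: bytes
    recognized_text: str
    output_text: str


def converge(
    session_id: str,
    first_store: Dict[str, Tuple[bytes, str]],   # session_id -> (input voice, recognized text)
    second_store: Dict[str, Tuple[str, str]],    # session_id -> (recognized text, output text)
) -> SessionFourTuple:
    """Join the first and second session information on the shared session identification."""
    if session_id not in first_store or session_id not in second_store:
        raise KeyError(f"session {session_id!r} is missing from one of the stores")
    input_voice, recognized_text = first_store[session_id]
    # Both records carry the recognized text; this sketch takes it from the first one.
    _, output_text = second_store[session_id]
    return SessionFourTuple(session_id, input_voice, recognized_text, output_text)
```

The session identification acts as the join key, so the recognized text that appears in both pieces of session information is stored only once in the converged record.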

Further, the receiving module 330 is specifically configured to receive the session tokens instruction fed back by the user through a first feedback device or a second feedback device; wherein the first feedback device and the second feedback device are two different feedback devices.

Further, the marking module 340 is specifically configured to: when the first session tokens instruction and the second session tokens instruction are both user-satisfied marking instructions, mark the converged first session information and second session information each as user-satisfied session information; or, when the first session tokens instruction and the second session tokens instruction are both user-unsatisfied marking instructions, mark the converged first session information and second session information each as user-unsatisfied session information; or, when the first session tokens instruction is the user-satisfied marking instruction and the second session tokens instruction is the user-unsatisfied marking instruction, mark the converged first session information as the user-satisfied session information and the converged second session information as the user-unsatisfied session information; or, when the first session tokens instruction is the user-unsatisfied marking instruction and the second session tokens instruction is the user-satisfied marking instruction, mark the converged first session information as the user-unsatisfied session information and the converged second session information as the user-satisfied session information.

In the technical solution of this embodiment, the modules cooperate to obtain the session information, converge the session information, receive the session tokens instructions fed back by the user, and mark the session information according to those instructions. Compared with the prior art, session information can be marked according to the user's feedback, which reduces the manpower required and improves marking efficiency. In addition, the technical solution of this embodiment of the present invention is simple and convenient to implement, easy to popularize, and widely applicable.

Embodiment four

Fig. 4 is a schematic structural diagram of an aggregate server provided by Embodiment four of the present invention. Fig. 4 shows a block diagram of an exemplary aggregate server 12 suitable for implementing embodiments of the present invention. The aggregate server 12 shown in Fig. 4 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.

As shown in Fig. 4, the aggregate server 12 takes the form of a general-purpose computing device. The components of the aggregate server 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).

The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

The aggregate server 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the aggregate server 12, including volatile and non-volatile media, and removable and non-removable media.

The system memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The aggregate server 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 4, commonly called a "hard disk drive"). Although not shown in Fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (such as a CD-ROM, DVD-ROM or other optical medium) may also be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.

A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.

The aggregate server 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the aggregate server 12, and/or with any device (such as a network card, a modem, etc.) that enables the aggregate server 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. The aggregate server 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 20. As shown, the network adapter 20 communicates with the other modules of the aggregate server 12 through the bus 18. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the aggregate server 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the session tokens method provided by the embodiments of the present invention.

Embodiment five

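Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program (also referred to as computer-executable instructions) is stored. When executed by a processor, the program performs a session tokens method, the method including: obtaining, according to a predetermined session identification, the first session information and the second session information corresponding to the session to be marked; converging the first session information and the second session information according to the session identification; receiving the session tokens instruction fed back by the user; and marking the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.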

The computer storage medium of this embodiment of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus or device.

A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and such a medium can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.

The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.

Computer program code for performing the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to them; it may include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims

1. A session tokens method, characterized in that it is applied to an aggregate server, the method comprising:

obtaining, according to a predetermined session identification, first session information and second session information corresponding to a session to be marked; wherein the first session information includes input voice information and text information formed after speech recognition is performed on the input voice information, and the second session information includes the text information formed after speech recognition is performed on the input voice information and output text information;

converging the first session information and the second session information according to the session identification;

receiving a session tokens instruction fed back by a user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction;

marking the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.
2. The method according to claim 1, characterized in that obtaining, according to the predetermined session identification, the first session information and the second session information corresponding to the session to be marked comprises:

obtaining, in a voice server and according to the session identification, the first session information corresponding to the session to be marked;

obtaining, in a semantic server and according to the session identification, the second session information corresponding to the session to be marked; wherein the output text information is the text information formed after the semantic server performs semantic parsing and result fulfilment on the text information formed after the speech recognition.
3. The method according to claim 2, characterized in that converging the first session information and the second session information according to the session identification comprises:

converging the session identification, the input voice information, the text information formed after the speech recognition, and the output text information into one four-tuple corresponding to the session to be marked.
4. The method according to claim 1, characterized in that receiving the session tokens instruction fed back by the user comprises:

receiving the session tokens instruction fed back by the user through a first feedback device or a second feedback device; wherein the first feedback device and the second feedback device are two different feedback devices.
5. The method according to claim 1, characterized in that marking the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively, comprises:

when the first session tokens instruction and the second session tokens instruction are both user-satisfied marking instructions, marking the converged first session information and second session information each as user-satisfied session information; or

when the first session tokens instruction and the second session tokens instruction are both user-unsatisfied marking instructions, marking the converged first session information and second session information each as user-unsatisfied session information; or

when the first session tokens instruction is the user-satisfied marking instruction and the second session tokens instruction is the user-unsatisfied marking instruction, marking the converged first session information as the user-satisfied session information and the converged second session information as the user-unsatisfied session information; or

when the first session tokens instruction is the user-unsatisfied marking instruction and the second session tokens instruction is the user-satisfied marking instruction, marking the converged first session information as the user-unsatisfied session information and the converged second session information as the user-satisfied session information.
6. A session tokens device, characterized in that the device comprises an acquisition module, a convergence module, a receiving module and a marking module; wherein

the acquisition module is configured to obtain, according to a predetermined session identification, first session information and second session information corresponding to a session to be marked; wherein the first session information includes input voice information and text information formed after speech recognition is performed on the input voice information, and the second session information includes the text information formed after speech recognition is performed on the input voice information and output text information;

the convergence module is configured to converge the first session information and the second session information according to the session identification;

the receiving module is configured to receive a session tokens instruction fed back by a user; wherein the session tokens instruction includes a first session tokens instruction or a second session tokens instruction;

the marking module is configured to mark the converged first session information and second session information according to the first session tokens instruction and the second session tokens instruction, respectively.
7. The device according to claim 6, characterized in that the acquisition module comprises a first acquisition unit and a second acquisition unit; wherein

the first acquisition unit is configured to obtain, in a voice server and according to the session identification, the first session information corresponding to the session to be marked;

the second acquisition unit is configured to obtain, in a semantic server and according to the session identification, the second session information corresponding to the session to be marked; wherein the output text information is the text information formed after the semantic server performs semantic parsing and result fulfilment on the text information formed after the speech recognition.
8. The device according to claim 7, characterized in that:

the convergence module is specifically configured to converge the session identification, the input voice information, the text information formed after the speech recognition, and the output text information into one four-tuple corresponding to the session to be marked.
9. The device according to claim 6, characterized in that:

the receiving module is specifically configured to receive the session tokens instruction fed back by the user through a first feedback device or a second feedback device; wherein the first feedback device and the second feedback device are two different feedback devices.
10. The device according to claim 6, characterized in that the marking module is specifically configured to:

when the first session tokens instruction and the second session tokens instruction are both user-satisfied marking instructions, mark the converged first session information and second session information each as user-satisfied session information; or

when the first session tokens instruction and the second session tokens instruction are both user-unsatisfied marking instructions, mark the converged first session information and second session information each as user-unsatisfied session information; or

when the first session tokens instruction is the user-satisfied marking instruction and the second session tokens instruction is the user-unsatisfied marking instruction, mark the converged first session information as the user-satisfied session information and the converged second session information as the user-unsatisfied session information; or

when the first session tokens instruction is the user-unsatisfied marking instruction and the second session tokens instruction is the user-satisfied marking instruction, mark the converged first session information as the user-unsatisfied session information and the converged second session information as the user-satisfied session information.
11. An aggregate server, characterized by comprising:

one or more processors; and

a memory for storing one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the session tokens method according to any one of claims 1 to 5.
12. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the session tokens method according to any one of claims 1 to 5.