CN113793612B - Method and device for updating a model service, and storage medium - Google Patents

Method and device for updating a model service, and storage medium

Info

Publication number
CN113793612B
CN113793612B · CN202111081195.3A
Authority
CN
China
Prior art keywords
service
model
audio stream
request
model service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111081195.3A
Other languages
Chinese (zh)
Other versions
CN113793612A (en)
Inventor
崔午阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd filed Critical Jingdong Technology Information Technology Co Ltd
Priority to CN202111081195.3A priority Critical patent/CN113793612B/en
Publication of CN113793612A publication Critical patent/CN113793612A/en
Application granted granted Critical
Publication of CN113793612B publication Critical patent/CN113793612B/en


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems
    • G10L 15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 Training
    • G10L 2015/0635 Training updating or merging of old and new templates; Mean values; Weighting
    • G10L 15/28 Constructional details of speech recognition systems
    • G10L 15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/133 Protocols for remote procedure calls [RPC]

Abstract

The present disclosure relates to a method, a device, and a storage medium for updating a model service. The method includes: receiving a service request through a management center and parsing the service request to obtain a request identifier, where the request identifier indicates the session to which the service request belongs; determining, using a consistent hashing algorithm, the model service corresponding to the session according to the request identifier; sending a plurality of service requests belonging to the same session to the model service through a first channel, where each service request carries an audio stream fragment; and assembling the plurality of audio stream fragments into a complete audio stream and updating the model service using the complete audio stream. This technical solution addresses the problem in the prior art that a model service can only be updated based on individual service requests and cannot be updated based on sessions.

Description

Method and device for updating a model service, and storage medium
Technical Field
The present disclosure relates to the field of internet technology, and in particular to a method, a device, and a storage medium for updating a model service.
Background
With the widespread use of artificial intelligence technology, the engineering aspects of algorithm models have become increasingly important. In the field of speech artificial intelligence, telephone systems serving as the input end of an entire speech service handle more and more access scenarios, so bringing a speech model service online without traffic loss has become an important topic in speech engineering. The conventional approach updates the model service based on individual service requests, but the model service has to record the session and context information to which the current voice stream belongs, and that context information records the voice stream fragments and configuration information of the session over the whole course of the request. Because each session consists of multiple service requests, and each request uploads a different fragment of the overall voice stream, the model service should be updated at the dimension of the session rather than the dimension of the individual service request.
In the process of implementing the disclosed concept, the inventor found at least the following technical problem in the related art: the model service can only be updated based on individual service requests and cannot be updated based on sessions.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, embodiments of the present disclosure provide a method, a device, and a storage medium for updating a model service, so as to at least solve the problem in the prior art that the model service can only be updated based on service requests and cannot be updated based on sessions.
The aim of the disclosure is achieved by the following technical scheme:
in a first aspect, an embodiment of the present disclosure provides a method for updating a model service, including: receiving a service request through a management center, and analyzing the service request to obtain a request identifier, wherein the request identifier is used for indicating a session corresponding to the service request; determining a model service corresponding to the session according to the request identifier by using a consistent hash algorithm; transmitting a plurality of service requests belonging to the same session to the model service through a first channel, wherein each service request comprises an audio stream fragment; and forming a plurality of audio stream fragments into a complete audio stream, and updating the model service by using the complete audio stream.
In an exemplary embodiment, the sending, through a first channel, a plurality of service requests belonging to the same session to the model service includes: under the condition that a model online request is received, establishing a second channel between the management center and a standby cache service, wherein the standby cache service has a corresponding relation with the model service; sending the plurality of service requests belonging to the current session to the model service through the first channel; and sending the plurality of service requests belonging to calls that begin after the current session to the standby cache service through the second channel.
In an exemplary embodiment, after the sending, through the first channel, the plurality of service requests belonging to the current session to the model service, the method further includes: disconnecting the first channel; and forming the plurality of audio stream fragments carried by the plurality of service requests of the current session into the complete audio stream, and updating the model service by using the complete audio stream.
In an exemplary embodiment, after said updating said model service using said complete audio stream, the method further includes: reestablishing the first channel between the management center and the model service; sending the plurality of service requests still belonging to the current session to the standby cache service through the second channel; and sending the plurality of service requests belonging to calls that begin after the current session to the model service through the first channel.
In an exemplary embodiment, after the sending, through the second channel, the plurality of service requests belonging to the current session to the standby cache service, the method further includes: disconnecting the second channel; and when the model service is updated next time, updating the model service according to the data in the standby cache service.
In an exemplary embodiment, the updating the model service according to the data in the backup cache service when the model service is updated next time includes: the plurality of audio stream fragments carried by a plurality of service requests in the standby cache service are formed into the complete audio stream; the model service is updated using the complete audio stream.
In an exemplary embodiment, the determining, using a consistent hashing algorithm, a model service corresponding to the session according to the request identifier includes: calculating a first hash value of the service request according to the request identifier; acquiring node information of a plurality of model services, and calculating second hash values of the plurality of model services according to the node information; and determining the model service corresponding to the session from the plurality of model services according to the first hash value and the second hash value.
In an exemplary embodiment, the forming of the plurality of audio stream fragments into a complete audio stream and the updating of the model service using the complete audio stream include: forming the plurality of audio stream fragments into the complete audio stream according to the order in which the plurality of service requests are received by the model service or according to the identification information of the plurality of audio stream fragments; and updating one or more models within the model service using the complete audio stream.
In a second aspect, an embodiment of the present disclosure provides an updating apparatus for a model service, including: the receiving module is used for receiving a service request through the management center, analyzing the service request to obtain a request identifier, wherein the request identifier is used for indicating a session corresponding to the service request; the determining module is used for determining the model service corresponding to the session according to the request identifier by using a consistent hash algorithm; a sending module, configured to send a plurality of service requests belonging to the same session to the model service through a first channel, where each service request includes an audio stream segment; and the updating module is used for forming a plurality of audio stream fragments into a complete audio stream and updating the model service by using the complete audio stream.
In a third aspect, embodiments of the present disclosure provide an electronic device. The electronic equipment comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus; a memory for storing a computer program; and a processor for realizing the updating method of the model service or the image processing method as described above when executing the program stored on the memory.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the method for updating the model service or the method for image processing as described above.
Compared with the prior art, the technical solution provided by the embodiments of the present disclosure has at least some or all of the following advantages: a service request is received through a management center and parsed to obtain a request identifier, where the request identifier indicates the session corresponding to the service request; the model service corresponding to the session is determined according to the request identifier using a consistent hashing algorithm; a plurality of service requests belonging to the same session are sent to the model service through a first channel, where each service request carries an audio stream fragment; and the plurality of audio stream fragments are assembled into a complete audio stream, which is used to update the model service. Because the embodiments of the present disclosure can send the plurality of service requests belonging to the same session to the model service through the first channel, assemble the plurality of audio stream fragments carried by those service requests into the complete audio stream, and update the model service using the complete audio stream, this technical means solves the problem in the prior art that the model service can only be updated based on service requests and cannot be updated based on sessions, and thereby provides a method for updating the model service at the session dimension.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the related art will be briefly described below, and it will be apparent to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 schematically illustrates a hardware block diagram of a computer terminal of a method for updating a model service according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a method of updating a model service in accordance with an embodiment of the present disclosure;
FIG. 3 schematically illustrates a schematic diagram of a method of updating a model service according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a block diagram of a model service updating apparatus according to an embodiment of the present disclosure;
fig. 5 schematically illustrates a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
The present disclosure will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided by the embodiments of the present disclosure may be performed in a computer terminal or a similar computing device. Taking a computer terminal as an example, fig. 1 schematically shows a hardware block diagram of a computer terminal of a method for updating a model service according to an embodiment of the present disclosure. As shown in fig. 1, the computer terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (Microprocessor Unit, abbreviated MPU) or a programmable logic device (Programmable Logic Device, abbreviated PLD)) and a memory 104 for storing data. Optionally, the computer terminal may further include a transmission device 106 for communication functions and an input/output device 108.
The memory 104 may be used to store computer programs, such as software programs of application software and modules, such as computer programs corresponding to the method of updating model services in the embodiments of the present disclosure, and the processor 102 executes the computer programs stored in the memory 104 to perform various functional applications and data processing, i.e., to implement the methods described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In the embodiment of the present disclosure, a method for updating a model service is provided, and fig. 2 schematically shows a flowchart of a method for updating a model service according to an embodiment of the present disclosure, as shown in fig. 2, where the flowchart includes the following steps:
step S202, receiving a service request through a management center, and analyzing the service request to obtain a request identifier, wherein the request identifier is used for indicating a session corresponding to the service request;
step S204, determining the model service corresponding to the session according to the request identification by using a consistent hash algorithm;
step S206, a plurality of service requests belonging to the same session are sent to the model service through a first channel, wherein each service request comprises an audio stream fragment;
step S208, a plurality of audio stream fragments are formed into a complete audio stream, and the model service is updated by using the complete audio stream.
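A minimal sketch of the flow in steps S202 to S208 is given below for illustration; all class, field, and method names (ManagementCenter, ModelService, request_id, audio, and so on) are assumptions made for this sketch and are not taken from the disclosure. The consistent-hash lookup of step S204 is passed in as a routing function and is sketched separately after the hash-ring explanation further below.

from collections import defaultdict

class ModelService:
    def __init__(self, name: str):
        self.name = name
        self.fragments = defaultdict(list)    # session id -> received audio stream fragments

    def receive(self, request: dict):
        # S206: each service request of the session carries one audio stream fragment
        self.fragments[request["request_id"]].append(request["audio"])

    def update_from_session(self, session_id: str):
        # S208: assemble the fragments into the complete audio stream and update the model
        complete_audio = b"".join(self.fragments.pop(session_id))
        print(f"{self.name}: updating model with {len(complete_audio)} bytes of audio")

class ManagementCenter:
    def __init__(self, services, pick_service):
        self.services = services              # the available model services
        self.pick_service = pick_service      # e.g. the consistent-hash lookup of step S204

    def handle(self, request: dict):
        # S202: the request identifier indicates the session the request belongs to
        session_id = request["request_id"]
        # S204: map the session to one model service so that all its requests land together
        service = self.pick_service(session_id, self.services)
        # S206: forward the request to that service over the first channel
        service.receive(request)

In this sketch the first channel is reduced to a direct method call; in a deployed system it would be a persistent RPC connection, as discussed later in this description.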
According to the embodiments of the present disclosure, a service request is received through a management center and parsed to obtain a request identifier, where the request identifier indicates the session corresponding to the service request; the model service corresponding to the session is determined according to the request identifier using a consistent hashing algorithm; a plurality of service requests belonging to the same session are sent to the model service through a first channel, where each service request carries an audio stream fragment; and the plurality of audio stream fragments are assembled into a complete audio stream, which is used to update the model service. Because the embodiments of the present disclosure can send the plurality of service requests belonging to the same session to the model service through the first channel, assemble the plurality of audio stream fragments carried by those service requests into the complete audio stream, and update the model service using the complete audio stream, this technical means solves the problem in the prior art that the model service can only be updated based on service requests and cannot be updated based on sessions, and thereby provides a method for updating the model service at the session dimension.
In step S206, sending a plurality of service requests belonging to the same session to the model service through a first channel includes: under the condition that a model online request is received, establishing a second channel between the management center and a standby cache service, wherein the standby cache service has a corresponding relation with the model service; sending the plurality of service requests belonging to the current session to the model service through the first channel; and sending the plurality of service requests belonging to calls that begin after the current session to the standby cache service through the second channel.
It should be noted that, the model online request is used to indicate updating the model service, and the model service cannot receive information any more when the model service is updated, so when the online request is received, a second channel needs to be established between the management center and the standby cache service, and when the model service is updated, the information which should be originally sent to the model service is received through the standby cache service. The model service may be understood as an instance, each instance corresponding to one or a class of models. If the cluster mode deployment is performed, a plurality of instances corresponding to the model service exist. Since the service and the model are in a 1-to-1 relationship in the micro service domain, deploying the model is equivalent to updating the model resource file on the model service.
The algorithm model is generally updated iteratively with different products and projects, including various types of customized scenes. Different clients can have different requirements on model accuracy in different fields, so that in the iterative process of products or projects, the algorithm model needs to be continuously subjected to accuracy optimization and retraining, and a new model can be output after each optimization. New model files are already different from old models and need to be brought up to the production environment. The accuracy of the new model is higher, so that the model update in the embodiment of the disclosure can improve the accuracy of the algorithm.
In the prior art, the model service can only be updated based on service requests, which causes traffic loss.
After sending a plurality of service requests belonging to a current session to the model service through the first channel, the method further comprises: disconnecting the first channel; and forming the plurality of audio stream fragments carried by the plurality of service requests of the current session into the complete audio stream, and updating the model service by using the complete audio stream.
According to the embodiments of the present disclosure, after the plurality of service requests belonging to the current session have been sent to the model service, the model service is updated using the plurality of audio stream fragments carried by those service requests, which avoids the loss caused by updating the model service based on individual service requests in the prior art. In the prior art, because the model service is updated based on service requests, the update does not wait until all requests of the current session have completed, so the subsequent information of the current session is lost; even if the subsequent service requests can still be sent normally, the current session is no longer complete, the results obtained afterwards cannot be associated with the current session, and the traffic of the current session is effectively lost. The correct approach is to switch the connection over to the standby cache service only once the current session has completely ended; after that, no session is lost, either in completeness or in quality, because the subsequent sessions are already being processed on the standby cache service.
The processing capacity of the standby cache service and the processing capacity of the model service are the same, so that the standby cache service can completely meet the load requirement when the model service is updated. In order to ensure that the model service online process is performed smoothly, each model service can be attached with 1 standby cache service, which depends on the configuration of the physical machine or virtual machine where the model service instance is located.
When the model service goes online, the standby cache service carries the traffic that would otherwise be routed to the model service. Because the model service has to be taken offline and restarted, it is briefly unavailable, so this portion of the traffic must be received and processed by the standby cache service during that time.
After the model service has been updated, the traffic of the sessions that have not yet ended on the standby cache service continues to be processed there, while the traffic of new requests and new sessions is sent to the model service again instead of being diverted to the standby cache service, so the traffic on the standby cache service gradually drains away.
After updating the model service using the complete audio stream, the method further comprises: reestablishing the first channel between the management center and the model service; sending the plurality of service requests still belonging to the current session to the standby cache service through the second channel; and sending the plurality of service requests belonging to calls that begin after the current session to the model service through the first channel.
After the model service has been updated, it can receive information again. At this point the first channel is re-established, and the service requests of calls that begin after the current session are sent to the model service through the first channel. The calls that begin after the current session may be one session or several sessions; how many sessions are diverted is determined by the time required to update the model service, that is, sessions that begin after the update has completed are received by the model service. Because the embodiments of the present disclosure operate at the session dimension, a session that is still in progress at the moment the update completes continues to be received by the standby cache service.
The first channel refers to the connection channel established with the model service. Because the connection to the model service has to be closed when the model service is updated, the original first channel is closed and destroyed at that point. Re-establishing the first channel means that the management center, after restoring its connection with the model service, creates the channel again. From the point of view of the connection, the re-created first channel is a new channel: the two first channels are different in nature and do not belong to the same connection.
After sending the plurality of service requests belonging to the current session to the standby cache service through the second channel, the method further includes: disconnecting the second channel; and when the model service is updated next time, updating the model service according to the data in the standby cache service.
After the model service has been updated using the complete audio stream, and after the plurality of service requests still belonging to the current session have been sent to the standby cache service through the second channel, the second channel is disconnected. The data in the standby cache service can be used to update the model service the next time the model service is updated; this data consists of the requests and information received by the standby cache service.
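The switching behaviour described above can be summarised by the following minimal sketch; it is only an illustration under assumed names (ChannelSwitcher, dispatch, receive) and not the concrete implementation of the disclosure, and the tearing down and re-creation of the channels themselves is not modelled.

class ChannelSwitcher:
    def __init__(self, model_service, standby_cache):
        self.model_service = model_service    # reached through the first channel
        self.standby_cache = standby_cache    # reached through the second channel
        self.updating = False                 # True between the model online request and completion
        self.known_sessions = set()
        self.sessions_on_standby = set()      # sessions diverted while the model service is updated

    def on_model_online_request(self):
        # the second channel is established: calls that begin from now on are diverted
        self.updating = True

    def on_update_finished(self):
        # the first channel is re-established: new calls go back to the model service,
        # while sessions already on the standby cache service finish there
        self.updating = False

    def dispatch(self, session_id, request):
        is_new = session_id not in self.known_sessions
        self.known_sessions.add(session_id)
        if self.updating and is_new:
            self.sessions_on_standby.add(session_id)
        if session_id in self.sessions_on_standby:
            return self.standby_cache.receive(request)
        return self.model_service.receive(request)

In this sketch, a session already in progress when the model online request arrives keeps going to the model service until it ends, sessions that begin during the update go to the standby cache service and stay there until they end, and sessions that begin after the update completes go to the model service again, which matches the lossless behaviour described above.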
The embodiment of the disclosure can realize the lossless online of the model service in the process of streaming end-to-end voice interaction.
Stream processing: typically the processing of an audio stream. For example, in a speech recognition scenario, real-time transcription text is returned while the user is still speaking. In this process, the requested audio has to be sent in the form of audio fragments or an audio stream, and the server returns the recognition result of the current audio fragment immediately after receiving it. In a speech recognition scenario, if only the currently requested audio fragment is processed, without combining the preceding and following context information, recognition accuracy will be poor, so streaming needs to guarantee that: 1) the requested data is sent in a streaming manner; 2) the processing result of the currently requested data stream is returned immediately, in real time; and 3) the processing result of the data stream is produced by caching and combining the previously uploaded parts, so that a more accurate result is given with the help of the context information.
Non-streaming processing: for example, identifying commodity information in a picture, or transcribing a recorded telephone audio file into text. In this process, the object of the request is "a picture" or "an audio file", and after the request is submitted, the algorithm model returns the processing result of the whole image or the transcription result of the whole audio in one go. The intermediate state of processing (such as words appearing one by one while speech is converted to text) is not visible, and the corresponding result is only returned once the request has completed. This is non-streaming processing.
Model online: packaging, releasing, and updating the model to the online environment. At this time, the audio stream requests being processed are affected, because the model file has to be updated and reloaded, so the recognition results of the audio streams currently being requested cannot be obtained.
Model lossless online: during the model online process, none of the traffic (including tasks being processed and results being returned) is affected in any way.
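The three streaming requirements listed above can be illustrated with the following toy sketch; it uses assumed names and a placeholder transcription function rather than a real speech model, and the file name call.wav is hypothetical.

from typing import Iterator

def audio_fragments(path: str, chunk_size: int = 3200) -> Iterator[bytes]:
    # client side: the requested audio is sent as a stream of fragments
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk

class StreamingRecognizer:
    # server side: the session's earlier fragments are cached as context
    def __init__(self):
        self.cached_audio = b""

    def feed(self, fragment: bytes) -> str:
        self.cached_audio += fragment                # cache the previously uploaded part
        return self.transcribe(self.cached_audio)    # partial result using the full context

    def transcribe(self, audio: bytes) -> str:
        return f"<partial transcript of {len(audio)} bytes>"   # placeholder for the model call

recognizer = StreamingRecognizer()
for frag in audio_fragments("call.wav"):
    print(recognizer.feed(frag))   # a partial result is returned immediately for each fragment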
When the model service is updated next time, updating the model service according to the data in the standby cache service comprises the following steps: the plurality of audio stream fragments carried by a plurality of service requests in the standby cache service are formed into the complete audio stream; the model service is updated using the complete audio stream.
In step S204, determining, using a consistent hashing algorithm, a model service corresponding to the session according to the request identifier, including: calculating a first hash value of the service request according to the request identifier; acquiring node information of a plurality of model services, and calculating second hash values of the plurality of model services according to the node information; and determining the model service corresponding to the session from the plurality of model services according to the first hash value and the second hash value.
The consistent hashing algorithm is a special hashing algorithm that is mainly used to solve the problem of uneven distribution of ordinary hashing and to avoid overloading some model services through uneven traffic load. According to the embodiments of the present disclosure, a first hash value of the service request is calculated from the request identifier and mapped onto a ring covering 0 to 2^32; second hash values of the plurality of model services are calculated from their node information and mapped onto the same ring. Starting from the first hash value, the ring is searched counterclockwise for the corresponding second hash value, and the model service corresponding to the session is thereby determined from the plurality of model services.
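A compact sketch of this lookup follows; the helper names (ring_hash, pick_model_service, node_info) are assumptions made for illustration. The search toward decreasing hash values, wrapping around to the largest node hash, models the counterclockwise search described above.

import bisect
import hashlib

RING_SIZE = 2 ** 32

def ring_hash(key: str) -> int:
    # map an arbitrary string onto the ring covering 0 to 2^32
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % RING_SIZE

def pick_model_service(request_id: str, services):
    # services: iterable of objects carrying a node_info string such as "host:port"
    first_hash = ring_hash(request_id)                      # first hash value of the service request
    ring = sorted(((ring_hash(s.node_info), s) for s in services),
                  key=lambda pair: pair[0])                 # second hash values of the model services
    hashes = [h for h, _ in ring]
    # nearest node hash at or below the request hash; wrap to the largest hash if none is below
    idx = bisect.bisect_right(hashes, first_hash) - 1
    return ring[idx][1]

Because every request identifier of a session hashes to the same position on the ring, all requests of that session are routed to the same model service, which is what allows the complete audio stream to be assembled on a single instance.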
In step S208, composing a plurality of audio stream segments into a complete audio stream, updating the model service using the complete audio stream, including: according to the sequence of the plurality of service requests received by the model service or the identification information of the plurality of audio stream fragments, the plurality of audio stream fragments are formed into the complete audio stream; one or more models within the model service are updated using the complete audio stream.
Updating the model service actually means updating one or more models within the model service. The plurality of audio stream fragments are assembled into the complete audio stream either according to the order in which the model service received the plurality of service requests or according to the identification information of the plurality of audio stream fragments.
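The two assembly strategies can be written down as follows; the seq field used in the second variant is a hypothetical identifier carried with each fragment, not a field defined by the disclosure.

def assemble_by_arrival(fragments: list) -> bytes:
    # fragments are already in the order in which the service requests arrived
    return b"".join(fragments)

def assemble_by_identifier(fragments: list) -> bytes:
    # each fragment is e.g. {"seq": 3, "audio": b"..."}; sort by the identification information
    ordered = sorted(fragments, key=lambda f: f["seq"])
    return b"".join(f["audio"] for f in ordered)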
It should be noted that the embodiments of the present disclosure use a traffic-switching scheme based on RPC (remote procedure call), and the management center has an RPC module for sending service requests.
In order to better understand the above technical solution, the embodiments of the present disclosure further provide an alternative embodiment, which is used to explain the above technical solution.
Fig. 3 schematically illustrates a schematic diagram of a method for updating a model service according to an embodiment of the disclosure, as shown in fig. 3:
Under the condition that no model online request has been received, the management center sequentially sends the plurality of service requests belonging to the same session to the model service through the first channel.
Under the condition that a model online request is received, a second channel is established between the management center and the standby cache service, the plurality of service requests belonging to the current session are sent to the model service through the first channel, and the first channel is then disconnected; the service requests belonging to calls that begin after the current session are sent to the standby cache service through the second channel.
After the model service has been updated using the complete audio stream, the first channel between the management center and the model service is re-established; the service requests still belonging to the current session are sent to the standby cache service through the second channel, and the second channel is then disconnected; the service requests belonging to calls that begin after the current session are sent to the model service through the first channel.
According to the embodiments of the present disclosure, a service request is received through a management center and parsed to obtain a request identifier, where the request identifier indicates the session corresponding to the service request; the model service corresponding to the session is determined according to the request identifier using a consistent hashing algorithm; a plurality of service requests belonging to the same session are sent to the model service through a first channel, where each service request carries an audio stream fragment; and the plurality of audio stream fragments are assembled into a complete audio stream, which is used to update the model service. Because the embodiments of the present disclosure can send the plurality of service requests belonging to the same session to the model service through the first channel, assemble the plurality of audio stream fragments carried by those service requests into the complete audio stream, and update the model service using the complete audio stream, this technical means solves the problem in the prior art that the model service can only be updated based on service requests and cannot be updated based on sessions, and thereby provides a method for updating the model service at the session dimension.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, and of course also by means of hardware, but in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present disclosure, or the part of it that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc), which includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods of the embodiments of the present disclosure.
In this embodiment, a device for updating a model service is further provided, and the device for updating a model service is used to implement the foregoing embodiments and preferred embodiments, which are not described in detail. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 4 schematically illustrates a block diagram of a model service updating apparatus according to an alternative embodiment of the present disclosure, as shown in fig. 4, the apparatus includes:
a receiving module 402, configured to receive a service request through a management center, and analyze the service request to obtain a request identifier, where the request identifier is used to indicate a session corresponding to the service request;
a determining module 404, configured to determine, according to the request identifier, a model service corresponding to the session using a consistent hashing algorithm;
a sending module 406, configured to send a plurality of service requests belonging to the same session to the model service through a first channel, where each service request includes an audio stream segment;
an updating module 408, configured to compose a plurality of audio stream segments into a complete audio stream, and update the model service using the complete audio stream.
According to the embodiments of the present disclosure, a service request is received through a management center and parsed to obtain a request identifier, where the request identifier indicates the session corresponding to the service request; the model service corresponding to the session is determined according to the request identifier using a consistent hashing algorithm; a plurality of service requests belonging to the same session are sent to the model service through a first channel, where each service request carries an audio stream fragment; and the plurality of audio stream fragments are assembled into a complete audio stream, which is used to update the model service. Because the embodiments of the present disclosure can send the plurality of service requests belonging to the same session to the model service through the first channel, assemble the plurality of audio stream fragments carried by those service requests into the complete audio stream, and update the model service using the complete audio stream, this technical means solves the problem in the prior art that the model service can only be updated based on service requests and cannot be updated based on sessions, and thereby provides a method for updating the model service at the session dimension.
Optionally, the sending module 406 is further configured to establish a second channel between the management center and a standby cache service when a model online request is received, where the standby cache service has a corresponding relation with the model service; send the plurality of service requests belonging to the current session to the model service through the first channel; and send the plurality of service requests belonging to calls that begin after the current session to the standby cache service through the second channel.
It should be noted that, the model online request is used to indicate updating the model service, and the model service cannot receive information any more when the model service is updated, so when the online request is received, a second channel needs to be established between the management center and the standby cache service, and when the model service is updated, the information which should be originally sent to the model service is received through the standby cache service. The model service may be understood as an instance, each instance corresponding to one or a class of models. If the cluster mode deployment is performed, a plurality of instances corresponding to the model service exist. Since the service and the model are in a 1-to-1 relationship in the micro service domain, deploying the model is equivalent to updating the model resource file on the model service.
The algorithm model is generally updated iteratively with different products and projects, including various types of customized scenes. Different clients can have different requirements on model accuracy in different fields, so that in the iterative process of products or projects, the algorithm model needs to be continuously subjected to accuracy optimization and retraining, and a new model can be output after each optimization. New model files are already different from old models and need to be brought up to the production environment. The accuracy of the new model is higher, so that the model update in the embodiment of the disclosure can improve the accuracy of the algorithm.
In the prior art, the model service can only be updated based on service requests, which causes traffic loss.
Optionally, the sending module 406 is further configured to disconnect the first channel; and forming the plurality of audio stream fragments carried by the plurality of service requests of the current session into the complete audio stream, and updating the model service by using the complete audio stream.
According to the embodiments of the present disclosure, after the plurality of service requests belonging to the current session have been sent to the model service, the model service is updated using the plurality of audio stream fragments carried by those service requests, which avoids the loss caused by updating the model service based on individual service requests in the prior art. In the prior art, because the model service is updated based on service requests, the update does not wait until all requests of the current session have completed, so the subsequent information of the current session is lost; even if the subsequent service requests can still be sent normally, the current session is no longer complete, the results obtained afterwards cannot be associated with the current session, and the traffic of the current session is effectively lost. The correct approach is to switch the connection over to the standby cache service only once the current session has completely ended; after that, no session is lost, either in completeness or in quality, because the subsequent sessions are already being processed on the standby cache service.
The processing capacity of the standby cache service and the processing capacity of the model service are the same, so that the standby cache service can completely meet the load requirement when the model service is updated. In order to ensure that the model service online process is performed smoothly, each model service can be attached with 1 standby cache service, which depends on the configuration of the physical machine or virtual machine where the model service instance is located.
When the model service goes online, the standby cache service carries the traffic that would otherwise be routed to the model service. Because the model service has to be taken offline and restarted, it is briefly unavailable, so this portion of the traffic must be received and processed by the standby cache service during that time.
After the model service has been updated, the traffic of the sessions that have not yet ended on the standby cache service continues to be processed there, while the traffic of new requests and new sessions is sent to the model service again instead of being diverted to the standby cache service, so the traffic on the standby cache service gradually drains away.
Optionally, the sending module 406 is further configured to reestablish the first channel between the management center and the model service; send the plurality of service requests still belonging to the current session to the standby cache service through the second channel; and send the plurality of service requests belonging to calls that begin after the current session to the model service through the first channel.
After the model service has been updated, it can receive information again. At this point the first channel is re-established, and the service requests of calls that begin after the current session are sent to the model service through the first channel. The calls that begin after the current session may be one session or several sessions; how many sessions are diverted is determined by the time required to update the model service, that is, sessions that begin after the update has completed are received by the model service. Because the embodiments of the present disclosure operate at the session dimension, a session that is still in progress at the moment the update completes continues to be received by the standby cache service.
The first channel refers to the connection channel established with the model service. Because the connection to the model service has to be closed when the model service is updated, the original first channel is closed and destroyed at that point. Re-establishing the first channel means that the management center, after restoring its connection with the model service, creates the channel again. From the point of view of the connection, the re-created first channel is a new channel: the two first channels are different in nature and do not belong to the same connection.
Optionally, the sending module 406 is further configured to disconnect the second channel; and when the model service is updated next time, updating the model service according to the data in the standby cache service.
After the model service has been updated using the complete audio stream, and after the plurality of service requests still belonging to the current session have been sent to the standby cache service through the second channel, the second channel is disconnected. The data in the standby cache service can be used to update the model service the next time the model service is updated; this data consists of the requests and information received by the standby cache service.
The embodiment of the disclosure can realize the lossless online of the model service in the process of streaming end-to-end voice interaction.
Stream processing: typically the processing of an audio stream. For example, in a speech recognition scenario, real-time transcription text is returned while the user is still speaking. In this process, the requested audio has to be sent in the form of audio fragments or an audio stream, and the server returns the recognition result of the current audio fragment immediately after receiving it. In a speech recognition scenario, if only the currently requested audio fragment is processed, without combining the preceding and following context information, recognition accuracy will be poor, so streaming needs to guarantee that: 1) the requested data is sent in a streaming manner; 2) the processing result of the currently requested data stream is returned immediately, in real time; and 3) the processing result of the data stream is produced by caching and combining the previously uploaded parts, so that a more accurate result is given with the help of the context information.
Non-streaming processing: for example, identifying commodity information in a picture, or transcribing a recorded telephone audio file into text. In this process, the object of the request is "a picture" or "an audio file", and after the request is submitted, the algorithm model returns the processing result of the whole image or the transcription result of the whole audio in one go. The intermediate state of processing (such as words appearing one by one while speech is converted to text) is not visible, and the corresponding result is only returned once the request has completed. This is non-streaming processing.
Model online: packaging, releasing, and updating the model to the online environment. At this time, the audio stream requests being processed are affected, because the model file has to be updated and reloaded, so the recognition results of the audio streams currently being requested cannot be obtained.
Model lossless online: during the model online process, none of the traffic (including tasks being processed and results being returned) is affected in any way.
When the model service is updated next time, updating the model service according to the data in the standby cache service comprises the following steps: the plurality of audio stream fragments carried by a plurality of service requests in the standby cache service are formed into the complete audio stream; the model service is updated using the complete audio stream.
Optionally, the determining module 404 is further configured to calculate a first hash value of the service request according to the request identifier; acquiring node information of a plurality of model services, and calculating second hash values of the plurality of model services according to the node information; and determining the model service corresponding to the session from the plurality of model services according to the first hash value and the second hash value.
The consistent hashing algorithm is a special hashing algorithm that is mainly used to solve the problem of uneven distribution of ordinary hashing and to avoid overloading some model services through uneven traffic load. According to the embodiments of the present disclosure, a first hash value of the service request is calculated from the request identifier and mapped onto a ring covering 0 to 2^32; second hash values of the plurality of model services are calculated from their node information and mapped onto the same ring. Starting from the first hash value, the ring is searched counterclockwise for the corresponding second hash value, and the model service corresponding to the session is thereby determined from the plurality of model services.
Optionally, the updating module 408 is further configured to compose the plurality of audio stream fragments into the complete audio stream according to the order in which the plurality of service requests are received by the model service or the identification information of the plurality of audio stream fragments; one or more models within the model service are updated using the complete audio stream.
Updating the model service actually means updating one or more models within the model service. The plurality of audio stream fragments are assembled into the complete audio stream either according to the order in which the model service received the plurality of service requests or according to the identification information of the plurality of audio stream fragments.
It should be noted that the embodiments of the present disclosure use a traffic-switching scheme based on RPC (remote procedure call), and the management center has an RPC module for sending service requests.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
Embodiments of the present disclosure provide an electronic device.
Fig. 5 schematically illustrates a block diagram of an electronic device according to an embodiment of the disclosure.
Referring to fig. 5, an electronic device 500 provided in an embodiment of the present disclosure includes a processor 501, a communication interface 502, a memory 503, and a communication bus 504, where the processor 501, the communication interface 502, and the memory 503 complete communication with each other through the communication bus 504; a memory 503 for storing a computer program; the processor 501 is configured to implement the steps in any of the method embodiments described above when executing a program stored on a memory.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the input/output device is connected to the processor.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
s1, receiving a service request through a management center, and analyzing the service request to obtain a request identifier, wherein the request identifier is used for indicating a session corresponding to the service request;
s2, determining a model service corresponding to the session according to the request identifier by using a consistent hash algorithm;
s3, sending a plurality of service requests belonging to the same session to the model service through a first channel, wherein each service request comprises an audio stream fragment;
s4, forming a plurality of audio stream fragments into a complete audio stream, and updating the model service by using the complete audio stream.
Embodiments of the present disclosure also provide a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of any of the method embodiments described above.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:
s1, receiving a service request through a management center, and analyzing the service request to obtain a request identifier, wherein the request identifier is used for indicating a session corresponding to the service request;
s2, determining a model service corresponding to the session according to the request identifier by using a consistent hash algorithm;
s3, sending a plurality of service requests belonging to the same session to the model service through a first channel, wherein each service request comprises an audio stream fragment;
s4, forming a plurality of audio stream fragments into a complete audio stream, and updating the model service by using the complete audio stream.
The computer-readable storage medium may be included in the apparatus/device described in the above embodiments, or may exist alone without being assembled into the apparatus/device. The computer-readable storage medium carries one or more programs which, when executed, implement the methods according to the embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations, and details are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the present disclosure described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network formed by a plurality of computing devices; optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be performed in an order different from the order shown or described here; alternatively, they may each be fabricated into an individual integrated circuit module, or a plurality of the modules or steps may be fabricated into a single integrated circuit module. As such, the present disclosure is not limited to any specific combination of hardware and software.
The foregoing descriptions are merely preferred embodiments of the present disclosure and are not intended to limit the present disclosure; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement, and the like made within the principles of the present disclosure shall be included in the protection scope of the present disclosure.

Claims (10)

1. A method for updating a model service, comprising:
receiving a service request through a management center, and analyzing the service request to obtain a request identifier, wherein the request identifier is used for indicating a session corresponding to the service request;
determining a model service corresponding to the session according to the request identifier by using a consistent hash algorithm;
transmitting a plurality of service requests belonging to the same session to the model service through a first channel, wherein each service request comprises an audio stream fragment;
forming a plurality of audio stream fragments into a complete audio stream, and updating the model service by using the complete audio stream;
wherein the model service is updated when a model online request is received, and the complete audio stream is used for updating the model service, so that the traffic or information of the session remains lossless while the model service is brought online.
2. The method of claim 1, wherein the sending the plurality of service requests belonging to the same session to the model service through the first channel comprises:
under the condition that a model online request is received, establishing a second channel between the management center and a standby cache service, wherein the standby cache service has a corresponding relation with the model service;
sending a plurality of service requests belonging to a current session to the model service through the first channel;
and sending a plurality of service requests belonging to call-backs after the current session to the standby cache service through the second channel.
3. The method of claim 2, wherein after the sending the plurality of service requests belonging to the current session to the model service through the first channel, the method further comprises:
disconnecting the first channel;
and forming the plurality of audio stream fragments carried by the plurality of service requests of the current session into the complete audio stream, and updating the model service by using the complete audio stream.
4. The method of claim 3, wherein after the updating the model service using the complete audio stream, the method further comprises:
reestablishing the first channel between the management center and the model service;
sending a plurality of service requests belonging to the current session to the standby cache service through the second channel;
and sending a plurality of service requests belonging to call-backs after the current session to the model service through the first channel.
5. The method of claim 4, wherein after the sending the plurality of service requests belonging to the current session to the standby cache service through the second channel, the method further comprises:
disconnecting the second channel;
and when the model service is updated next time, updating the model service according to the data in the standby cache service.
6. The method of claim 5, wherein the next time the model service is updated, updating the model service based on the data in the backup cache service comprises:
forming the plurality of audio stream fragments carried by a plurality of service requests in the standby cache service into the complete audio stream; and
updating the model service by using the complete audio stream.
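For readability, a rough, non-authoritative sketch of the dual-channel handover described in claims 2 to 6 is given below; the Channel and Handover classes and the receive, update, and fragments methods are illustrative assumptions, not elements recited by the claims.

```python
# Rough sketch of the dual-channel handover of claims 2-6. Channel is a toy
# transport; model_service and standby_cache are assumed to expose
# receive(request), update(audio), and (for the cache) fragments().


class Channel:
    def __init__(self, target):
        self.target = target
        self.open = True

    def send(self, request):
        if self.open:
            self.target.receive(request)

    def close(self):
        self.open = False


class Handover:
    def __init__(self, model_service, standby_cache):
        self.model_service = model_service
        self.standby_cache = standby_cache
        self.first_channel = Channel(model_service)   # management center <-> model service
        self.second_channel = None                    # management center <-> standby cache

    def on_model_online_request(self):
        # Claim 2: establish the second channel to the standby cache service.
        self.second_channel = Channel(self.standby_cache)

    def route(self, request, in_current_session):
        # Claim 2: requests of the current session keep flowing over the first
        # channel; call-backs after the current session go to the standby cache.
        if in_current_session and self.first_channel.open:
            self.first_channel.send(request)
        elif self.second_channel is not None:
            self.second_channel.send(request)

    def finish_current_session(self, complete_audio):
        # Claim 3: disconnect the first channel and update the model service
        # with the complete audio stream of the current session.
        self.first_channel.close()
        self.model_service.update(complete_audio)
        # Claim 4: reestablish the first channel so later call-backs reach the
        # updated model service.
        self.first_channel = Channel(self.model_service)

    def next_update(self):
        # Claims 5-6: at the next update, form the fragments buffered in the
        # standby cache service into a complete audio stream and update again.
        self.model_service.update(b"".join(self.standby_cache.fragments()))
```

The sketch only illustrates the ordering of channel operations; how a request is classified as belonging to the current session is left to the routing logic described elsewhere in the disclosure.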
7. The method according to claim 1, wherein the determining, using a consistent hashing algorithm, a model service corresponding to the session according to the request identifier includes:
calculating a first hash value of the service request according to the request identifier;
acquiring node information of a plurality of model services, and calculating second hash values of the plurality of model services according to the node information;
and determining the model service corresponding to the session from the plurality of model services according to the first hash value and the second hash values.
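A hedged sketch of the routing in claim 7 follows, under the assumptions of an MD5-based hash ring and no virtual nodes; neither assumption is taken from the patent.

```python
# Sketch of claim 7's routing: hash the request identifier (first hash value)
# and the node information of each model service (second hash values) onto one
# ring, then pick the first node at or after the request's position.
import bisect
import hashlib


def _ring_hash(key: str) -> int:
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def route_session(request_id: str, node_infos: list) -> str:
    # First hash value: computed from the request identifier.
    first_hash = _ring_hash(request_id)
    # Second hash values: computed from the node information of each model service.
    ring = sorted((_ring_hash(info), info) for info in node_infos)
    positions = [pos for pos, _ in ring]
    # The session maps to the first node at or after its position on the ring.
    index = bisect.bisect_left(positions, first_hash) % len(ring)
    return ring[index][1]


# Usage (hypothetical node addresses): the same request identifier always
# resolves to the same model service while the node set stays unchanged.
# route_session("session-42", ["node-a:50051", "node-b:50051", "node-c:50051"])
```

This stickiness is what lets all service requests of one session, and hence all of its audio stream fragments, arrive at a single model service.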
8. The method of claim 1, wherein the forming the plurality of audio stream fragments into the complete audio stream, and updating the model service by using the complete audio stream, comprises:
forming the plurality of audio stream fragments into the complete audio stream according to the sequence in which the plurality of service requests are received by the model service or according to identification information of the plurality of audio stream fragments; and
updating one or more models within the model service by using the complete audio stream.
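A minimal sketch of claim 8's assembly step is given below, assuming each fragment carries an integer sequence number as its identification information and each model exposes an update hook; both assumptions are illustrative only.

```python
# Sketch of claim 8: order the audio stream fragments (here by a hypothetical
# integer sequence number standing in for the fragments' identification
# information), join them into the complete audio stream, and feed that stream
# to every model inside the model service.
from typing import Iterable, Tuple


def assemble_audio(fragments: Iterable[Tuple[int, bytes]]) -> bytes:
    # Each item is (sequence_number, fragment_bytes); sorting by sequence
    # number reconstructs the complete audio stream.
    return b"".join(data for _, data in sorted(fragments, key=lambda item: item[0]))


def update_models(models, complete_audio: bytes) -> None:
    # "models" is the collection of one or more models inside one model
    # service; each is assumed to expose an update(audio) hook.
    for model in models:
        model.update(complete_audio)
```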
9. An updating apparatus for a model service, comprising:
a receiving module, configured to receive a service request through a management center, and analyze the service request to obtain a request identifier, wherein the request identifier is used for indicating a session corresponding to the service request;
a determining module, configured to determine, by using a consistent hash algorithm, a model service corresponding to the session according to the request identifier;
a sending module, configured to send a plurality of service requests belonging to the same session to the model service through a first channel, wherein each service request comprises an audio stream fragment;
and an updating module, configured to form a plurality of audio stream fragments into a complete audio stream, and update the model service by using the complete audio stream;
wherein the model service is updated when a model online request is received, and the complete audio stream is used for updating the model service, so that the traffic or information of the session remains lossless while the model service is brought online.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of any one of claims 1 to 8.
CN202111081195.3A 2021-09-15 2021-09-15 Updating method and device of model service and storage medium Active CN113793612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111081195.3A CN113793612B (en) 2021-09-15 2021-09-15 Updating method and device of model service and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111081195.3A CN113793612B (en) 2021-09-15 2021-09-15 Updating method and device of model service and storage medium

Publications (2)

Publication Number Publication Date
CN113793612A CN113793612A (en) 2021-12-14
CN113793612B (en) 2024-04-09

Family

ID=78878395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111081195.3A Active CN113793612B (en) 2021-09-15 2021-09-15 Updating method and device of model service and storage medium

Country Status (1)

Country Link
CN (1) CN113793612B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2501105A1 (en) * 2011-03-14 2012-09-19 British Telecommunications Public Limited Company Service session resource management
CN103209223A (en) * 2013-04-27 2013-07-17 中国农业银行股份有限公司 Distributed application conversation information sharing method and system and application server
KR20180024841A (en) * 2016-08-31 2018-03-08 주식회사 구루미디어 Streaming service server and control method thereof
CN110839084A (en) * 2019-11-19 2020-02-25 中国建设银行股份有限公司 Session management method, device, equipment and medium
CN111209023A (en) * 2020-01-08 2020-05-29 百度在线网络技术(北京)有限公司 Skill service updating method and device, electronic equipment and readable storage medium
CN112231377A (en) * 2020-09-27 2021-01-15 云镝智慧科技有限公司 Data mapping method, system, device, server and storage medium
CN112667270A (en) * 2020-12-23 2021-04-16 科大讯飞股份有限公司 Voice processing resource updating method, computer equipment and storage device
CN112837683A (en) * 2020-12-31 2021-05-25 苏州思必驰信息科技有限公司 Voice service method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160065677A1 (en) * 2014-09-02 2016-03-03 Palo Alto Research Center Incorporated System and method for a reliable content exchange of a ccn pipeline stream
US10324704B2 (en) * 2015-05-27 2019-06-18 Google Llc Online marketplace of plugins for enhancing dialog systems
US11567742B2 (en) * 2016-12-29 2023-01-31 Atlassian Pty Ltd. Method, apparatus, and computer program product for generating updated network application interfaces
KR102550932B1 (en) * 2017-12-29 2023-07-04 삼성전자주식회사 Method and apparatus for personalizing speech recognition model
US11551012B2 (en) * 2019-07-05 2023-01-10 Electronics And Telecommunications Research Institute Apparatus and method for providing personal assistant service based on automatic translation

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2501105A1 (en) * 2011-03-14 2012-09-19 British Telecommunications Public Limited Company Service session resource management
CN103209223A (en) * 2013-04-27 2013-07-17 中国农业银行股份有限公司 Distributed application conversation information sharing method and system and application server
KR20180024841A (en) * 2016-08-31 2018-03-08 주식회사 구루미디어 Streaming service server and control method thereof
CN110839084A (en) * 2019-11-19 2020-02-25 中国建设银行股份有限公司 Session management method, device, equipment and medium
CN111209023A (en) * 2020-01-08 2020-05-29 百度在线网络技术(北京)有限公司 Skill service updating method and device, electronic equipment and readable storage medium
CN112231377A (en) * 2020-09-27 2021-01-15 云镝智慧科技有限公司 Data mapping method, system, device, server and storage medium
CN112667270A (en) * 2020-12-23 2021-04-16 科大讯飞股份有限公司 Voice processing resource updating method, computer equipment and storage device
CN112837683A (en) * 2020-12-31 2021-05-25 苏州思必驰信息科技有限公司 Voice service method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Smart drainage pipe network system based on sensor network and GIS; Wu Xingxing; Computer Systems &amp; Applications; p. 44 *

Also Published As

Publication number Publication date
CN113793612A (en) 2021-12-14

Similar Documents

Publication Publication Date Title
US10298457B2 (en) Synchronization of configuration file of virtual application distribution chassis
CN108696899B (en) SIP message transmitting and receiving method and transmitting and receiving device
CN110545327B (en) Information pushing method and system
CN107690149B (en) Method for triggering network policy update, management function entity and core network equipment
CN112615753A (en) Link abnormity tracking method, first node, second node and link
CN106911769B (en) Cloud platform routing data processing method and physical server of cloud platform
CN113064707A (en) Message processing method, device and equipment
WO2017220021A1 (en) Short message processing method and apparatus
CN107959702B (en) Routing method and device
CN114095876A (en) Broadcast data processing method and device, storage medium and electronic device
CN113793612B (en) Updating method and device of model service and storage medium
CN113407823A (en) IOS message pushing method and system
CN106341433A (en) Service system, signal server and intercom server
CN108696855B (en) Information management method, session management function and mobility management function
CN113612811B (en) Method, system, equipment and medium for client mounting in multiple channels
CN116055403A (en) Message data transmission method, device and server
CN113553206B (en) Data event execution method and device, electronic equipment and computer readable medium
KR20160026138A (en) Rapid sync method for cloud file system and cloud file system using the same
CN112565337B (en) Request transmission method, server, client, system and electronic equipment
US9742927B2 (en) Online charging method for always on IP connectivity
CN113746851A (en) Agent system and method for supporting real-time analysis of GRPC request
CN116420363A (en) Methods, systems, and computer readable media for supporting migration of user profile and policy information
CN105516923B (en) A kind of method and OTA server of determining terminal device region position
WO2023005484A1 (en) User information registration method and apparatus
JP6298013B2 (en) Policy control system and policy control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant