WO2021015651A1 - IMS node, network node and methods in a communications network - Google Patents


Info

Publication number
WO2021015651A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
network node
media
participants
IMS
Prior art date
Application number
PCT/SE2019/050708
Other languages
French (fr)
Inventor
Ester Gonzalez De Langarica
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/SE2019/050708 priority Critical patent/WO2021015651A1/en
Publication of WO2021015651A1 publication Critical patent/WO2021015651A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/10: Architectures or entities
    • H04L65/1016: IP multimedia subsystem [IMS]
    • H04L65/1066: Session management
    • H04L65/1069: Session establishment or de-establishment
    • H04L65/1083: In-session procedures
    • H04L65/1086: In-session procedures; session scope modification
    • H04L65/40: Support for services or applications
    • H04L65/403: Arrangements for multi-party communication, e.g. for conferences
    • H04L65/4038: Arrangements for multi-party communication, e.g. for conferences, with floor control
    • H04L65/80: Responding to QoS
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/24: Negotiation of communication capabilities

Definitions

  • Embodiments herein relate to an Internet Protocol Multimedia Subsystem (IMS) node, a network node and methods performed therein.
  • embodiments herein relate to handling media between participants in a media session.
  • OTT Over-The-Top
  • the IP network may e.g. be a public internet or cloud services delivered via a third party access network, as opposed to a carrier's own access network.
  • OTT may refer to a variety of services including communications, such as e.g. voice and/or messaging, content, such as e.g. TV and/or music, and cloud-based offerings, such as e.g. computing and storage.
  • a further OTT service is a Digital Assistant (DA).
  • the DA may perform tasks or services upon request from a user, and may be implemented in several ways.
  • a first way to implement the DA may be to provide the UE of the user with direct access to a network node controlled by a third party service provider comprising a DA platform. This may e.g. be done using a dedicated UE having access to the network node. This way of implementing the DA is commonly referred to as an OTT-controlled DA.
  • a further way to implement the DA is commonly referred to as an operator controlled DA.
  • functionality such as e.g. keyword detection, request fulfillment and media handling may be contained within the domain of the operator referred to as operator domain.
  • the operator controls the whole DA solution without the UE being impacted.
  • a user of the UE may provide instructions, such as e.g. voice commands, to a core network node, such as e.g. an IMS node, of the operator.
  • the voice command may e.g. be "Digital Assistant, I want a pizza", "Digital Assistant, tell me how many devices are active right now", "Digital Assistant, set up a conference", or "Digital Assistant, how much credit do I have?".
  • the core network node may detect a hot word, which may also be referred to as a keyword, indicating that the user is providing instructions to the DA and may forward the instructions to a network node controlled by a third party service provider, the network node may e.g. comprise a DA platform.
  • the DA platform may e.g. be a bot, e.g. a software program, of a company providing a certain service, such as e.g. a taxi service or a food delivery service.
  • the instructions may be forwarded to the DA platform using e.g. a Session Initiation Protocol/ Real-time Transport Protocol (SIP/RTP).
  • the DA platform may comprise certain functionality.
  • the DA platform may then forward the instructions to a further network node, which may e.g. be an Application Server (AS) node, which has access to the core network node via an Application Programming Interface (API) denoted as a Service Exposure API.
  • the DA platform is often required to pay a fee to the operator in order to be reachable by the operator's DA users.
  • the user may also be required to pay fees to the operator and network provider for the usage of DA services.
  • the operator may further be required to pay fees to the network provider for every transaction performed via the Service Exposure API.
  • An operator controlled DA may be used in conjunction with an in-call service.
  • the operator has full control of the media.
  • the operator may listen to a conversation and perform the requested service, e.g., translate and/or transcribe what is said in the conversation.
  • the operator listens to the conversation and translates and/or transcribes the incoming audio from the participants in the media session.
  • the written transcript and translated content may then be continuously delivered to the users in real time.
  • the transcript and/or translated content may be delivered to the user in several ways, such as e.g. via messaging to each user or published on a web page where users can see both the transcript and the associated translation.
  • An in-call translation in general, comprises two main relevant parts: (1) capturing and transcribing an audio input via a UE, i.e. through the microphone; and (2) performing the translation.
  • the challenge is often not in the translation itself, but in the capture of the audio.
  • a translation service may misunderstand what is said due to, e.g., background noise, a person’s accent or articulation, and/or flaws in the speech recognition system. If the capture of the audio is flawed, it will most likely generate incorrect transcripts and consequent incorrect translations.
  • Another factor affecting the audio capture is background noise. While a participant is speaking, and other participants do not mute their microphones, the background noise from the other participants will be added to the audio capture that is sent to the in-call service.
  • call recordings are one such example, where the quality of the recording is directly related to the quality of the audio input. Furthermore, the quality of a call recording is significantly increased if, for example, a DA can identify the identity of a voice owner.
  • in-call services may be very complex.
  • When an audio input comes in a single channel with all participants’ audio mixed, it is difficult for the Digital Assistant to identify the source of the audio, which may lead to a lower quality of e.g. the translation.
  • a single media channel comprising audio input from multiple participants is not an ideal solution.
  • An object of embodiments herein is therefore to improve the performance of in-call services such as in-call translations, in a communication network.
  • the object is achieved by a method performed by an IMS node in a communications network, for handling media from one or more participants in a media session.
  • the IMS node obtains, from a network node, a request to create one or more individual media channels, for each participant.
  • the IMS node then creates, one or more individual media channels for each participant, and performs the media session using the created one or more individual media channels.
  • the object is achieved by a method performed by a network node in a communications network, for handling media from one or more participants in a media session.
  • the network node obtains a request to start an in-call service, and sends, to the IMS node, a request to create one or more individual media channels, for each participant.
  • the object is achieved by an IMS node in a communications network, configured to handle media between participants in a media session.
  • the IMS node is further configured to obtain, from a network node, a request to create individual media channels for the media session.
  • the IMS node is further configured to create individual media channels for the media session, at least one individual media channel for each participant, and perform the media session using the created individual media channels.
  • the object is achieved by a network node in a communications network configured to handle media between participants in a media session.
  • the network node is further configured to obtain a request to start an in-call service and send, to an IMS node, a request to create individual media channels for the media session, at least one individual media channel for each participant.
  • performance of in-call services may be improved according to the embodiments above, e.g. since individual media channels are created for each participant thus improving e.g. quality of the audio and hence also transcripts of the audio.
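  • The IMS-node behaviour summarized above, i.e. obtain a request, create at least one individual media channel per participant (plus one for the DA), perform the session, and later remove the channels, can be sketched as follows. This is a hypothetical illustration; the class and method names are not taken from the publication.

```python
class IMSNode:
    """Hypothetical sketch of the IMS node described in the embodiments."""

    def __init__(self):
        self.channels = {}

    def create_individual_channels(self, participants, include_da=True):
        """Create one bi-directional audio channel per participant, plus one for the DA."""
        for p in participants:
            self.channels[p] = {"media": "audio", "direction": "bidirectional"}
        if include_da:
            self.channels["DA"] = {"media": "audio", "direction": "bidirectional"}
        return list(self.channels)

    def remove_individual_channels(self):
        """Tear down the individual channels once the in-call service ends."""
        self.channels.clear()

node = IMSNode()
# Two participants plus the DA give three individual channels.
assert node.create_individual_channels(["userA", "userB"]) == ["userA", "userB", "DA"]
node.remove_individual_channels()
assert node.channels == {}
```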
  • Figure 1 is a schematic diagram illustrating an operator controlled DA.
  • Figure 2 is a schematic diagram illustrating embodiments of a communications network.
  • Figure 3 is a schematic overview depicting embodiments of a method in the communications network.
  • Figure 4 is a signalling diagram depicting embodiments of a method in the communications network.
  • Figure 5 is a flowchart depicting embodiments of a method in an IMS node.
  • Figure 6 is a flowchart depicting embodiments of a method in a network node.
  • Figure 7 is a schematic diagram illustrating embodiments of an IMS node.
  • Figure 8 is a schematic diagram illustrating embodiments of a network node.
  • Figure 9 schematically illustrates a telecommunications network connected via an intermediate network to a host computer.
  • Figure 10 is a generalized block diagram of a host computer communicating via a base station with a user equipment over a partially wireless connection.
  • Figures 11 to 14 are flowcharts illustrating methods implemented in a communication system including a host computer, a base station and a UE.
  • Embodiments herein relate to a mechanism for allowing the creation and deletion of media channels, e.g. carrying audio, from all participants involved in a media session, such as a conference call, as well as from an operator controlled DA.
  • embodiments herein facilitate services such as in-call translations, recordings and call transcripts, since it enables handling media from different call participants individually.
  • such a mechanism allows a DA to listen to additional requests from the participants while at the same time e.g. translating, transcribing and/or recording the conversation.
  • a new Application Program Interface may e.g. be provided by the IMS network.
  • Figure 1 depicts the fundamentals of an operator controlled DA, herein referred to as a DA.
  • a first and a second UE user, A and B are connected to a DA platform node via the IMS CN.
  • the communication between the entities may be performed with Voice over IP (VoIP) communication, through Session Initiation Protocol (SIP) and Real Time Protocol (RTP) signalling methods.
  • the DA platform node may in turn be connected to entities in a third party domain, such as databases and cloud based services.
  • Any user or participant involved in a media session, i.e. both the users A and B as depicted in Figure 1, may engage an in-call service such as a translation service through the use of the DA.
  • the user A may in such a scenario e.g. say "Operator, translate this call".
  • the DA may then, in response, activate an in-call translation service which may e.g. be provided via a cloud based service.
  • the users A and B may each be associated with a respective UE: a first UE 121 and a second UE 122 respectively.
  • the first UE 121 and the second UE 122 provide an interface so that the users of the UEs can convey information to the DA and to any other participant in the media session.
  • In a scenario when a DA has been engaged to activate an in-call translation service, the DA is in full control of the media in the media session and, accordingly, of the transcriptions and translations that are taking place during the course of the media session.
  • the translation service may be deactivated, via the DA, at any time by any of the participants in the media session.
  • a problem with in-call translation services may be that the audio input is flawed. Therefore, the interface on the respective UE is important for the participants in the media session in order for them to be able to see if a spoken sentence has been correctly captured by the DA.
  • FIG. 2 is a schematic overview depicting a communications network 100 wherein embodiments herein may be implemented.
  • the communications network 100 comprises one or more RANs and one or more CNs.
  • the communications network 100 may use any technology such as 5G new radio (NR) but may further use a number of other different technologies, such as Wi-Fi, long term evolution (LTE), LTE-Advanced, wideband code division multiple access (WCDMA), global system for mobile communications/enhanced data rates for GSM evolution (GSM/EDGE), worldwide interoperability for microwave access (WiMax), or ultra mobile broadband (UMB).
  • One or more network nodes 140 operate in the communications network 100.
  • Such a network node may be a cloud based server or an application server providing processing capacity for, e.g., managing a DA, handling conferencing, and handling translations in an ongoing media session between participants.
  • the network nodes may e.g. comprise a first network node 141 and a second network node 142.
  • Nodes in an IMS network, such as an IMS node 150, also operate in the communications network 100.
  • the IMS node 150 may e.g. be comprised in the CN.
  • the IMS node 150 may be connected to one or more of the network nodes 140.
  • the IMS node 150 may e.g. be connected to the first network node 141.
  • the first network node 141 may e.g. be represented by an Application Server (AS) node or a DA platform node.
  • the first network node 141 may be located in a cloud 101 as depicted in Figure 1, in the CN or in a Third Party domain of the communications network 100.
  • the first network node 141 may act as a gateway to the second network node 142, which may e.g. comprise an in-call service.
  • the IMS node 150, the first network node 141 and the second network node 142 may be collocated nodes, stand-alone nodes or distributed nodes comprised fully or partly in the cloud 101.
  • the methods according to some embodiments herein are performed by the one or more network nodes 140, which may comprise the first network node 141 and the second network node 142.
  • the communications network 100 may further comprise one or more radio network nodes 110 providing radio coverage over a respective geographical area by means of antennas or similar.
  • the geographical area may be referred to as a cell, a service area, beam or a group of beams.
  • the radio network node 110 may be a transmission and reception point, e.g. a radio access network node such as a base station, e.g. a radio base station such as a NodeB, an evolved Node B (eNB, eNode B), an NR Node B (gNB), a base transceiver station, a radio remote unit, an Access Point Base Station, a base station router, a transmission arrangement of a radio base station, a stand-alone access point, a Wireless Local Area Network (WLAN) access point, an Access Point Station (AP STA), an access controller, a UE acting as an access point or a peer in a Mobile device to Mobile device (D2D) communication, or any other network unit capable of communicating with a UE within the cell served by the radio network node 110, depending e.g. on the radio access technology and terminology used.
  • UEs, such as the first UE 121 and the second UE 122, operate in the communications network 100. The first UE 121 and the second UE 122 may e.g. each be a mobile station, a non-access point (non-AP) STA, a STA, a user equipment (UE) and/or a wireless terminal, an NB-Internet of Things (NB-IoT) device, a Wi-Fi device, an LTE device or an NR device, communicating via one or more Access Networks (AN), e.g. RAN, to one or more core networks (CN).
  • UE is a non-limiting term which means any terminal, wireless communication terminal, wireless mobile device, device to device (D2D) terminal, or node e.g. smart phone, laptop, mobile phone, sensor, relay, mobile tablets, television units or even a small base station communicating within a cell.
  • the methods according to some embodiments herein are performed by the first UE 121 , and the methods according to some other embodiments are performed by the second UE 122.
  • An example of embodiments herein is depicted in Fig. 3 and will be explained by means of the following example scenario.
  • user A and user B are engaged in a media session, such as an audio conferencing session.
  • the user A is associated with the first UE 121 and the user B is associated with the second UE 122.
  • the first UE 121 and the second UE 122 are connected to the network node 140 via the IMS node 150.
  • the network node 140 may be a distributed node comprising the first network node 141 and the second network node 142.
  • the first network node 141 may be a DA platform node and the second network node 142 may be a Third Party in-call service provider.
  • the network node 140 comprises at least the first network node 141, which is a DA platform node.
  • the first network node 141 may be connected to, or comprise, an in-call translation service.
  • Either of the users A and B may start the in-call translation service by e.g. giving a voice command to the DA.
  • a voice command may comprise a keyword, such as "operator", and an intent, such as "start translation".
  • the user A, for example, may start the in-call translation by saying "Operator, translate the call!".
  • the network node 140 is aware of the number of participants in the media session, i.e. two in the example in Fig. 3, and may therefore request that the IMS network create individual media channels for each participant in the media session. The network node 140 may further require that a separate channel is created for audio input to, and audio output from, the DA.
  • audio input from the user A and the user B are provided to the IMS network, through the IMS node 150.
  • the IMS node 150 may comprise a Virtual Multimedia Resource Function (vMRF), which may be suitable for the task of creating individual media channels for the audio in the media session.
  • the vMRF may be used to mix and process media streams and may be controlled by the Multimedia Telephony Application Server (MTAS), using, for example, an H.248 control protocol.
  • the vMRF supports playing tones and audio
  • When audio conferencing is handled by the vMRF, it may be specified for each participant in the media session whether the connection is unidirectional, i.e. allowing listening only, or bidirectional.
  • the audio conferencing feature in the vMRF may dynamically mix audio from several participants in the media session. The audio from an individual participant may be mixed to the audio conference based on active speaker logic. Each participant may have its own codec in use, and the audio conferencing feature in the vMRF may provide transcoding between the participants. Audio mixing may e.g. be performed at 16 kHz sampling rate, which may enable wideband conferencing for wideband-capable UEs.
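  • The per-listener mixing behaviour described above can be illustrated with a simplified sketch: each participant receives a mix of all the other participants' audio, never their own. Real vMRF mixing additionally involves codecs, transcoding and active-speaker selection; the function below, with hypothetical names, only shows the sample-wise summation idea on raw PCM frames.

```python
def mix_for_listener(listener, frames):
    """Mix one audio frame for a listener.

    frames: dict mapping participant -> list of PCM samples for one frame
    (e.g. a 20 ms frame at 16 kHz). The listener's own audio is excluded.
    """
    others = [samples for p, samples in frames.items() if p != listener]
    if not others:
        return []
    # Sum the other participants' contributions sample by sample.
    return [sum(column) for column in zip(*others)]

frames = {"A": [100, 200], "B": [10, 20], "C": [1, 2]}
# A hears B and C mixed together, but not A's own audio.
assert mix_for_listener("A", frames) == [11, 22]
```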
  • the vMRF supports at least the following audio codecs: linear Pulse Code Modulation (PCM), Enhanced Voice Services (EVS), Adaptive Multi-Rate Narrowband (AMR-NB), Adaptive Multi-Rate Wideband (AMR-WB), and International Telecommunication Union (ITU) standards such as G.711, G.722 and G.729.
  • the vMRF is a Virtualized Network Function (VNF) and a single VNF may contain multiple Virtual Machines (VMs).
  • the vMRF VNF may be deployed in the communications network 100 multiple times, for example as a separate VNF each time.
  • each VM may provide various functions, such as a Payload (PL) function and a System Controller (SC) function.
  • a request may be sent from the network node 140 to the IMS node 150 to create at least as many individual media channels as the number of call participants and at least one individual media channel for the DA.
  • a function that allows media to be delivered and processed individually may be required.
  • a mechanism such as an Application Program Interface (API) may be employed.
  • a new API is therefore provided.
  • such a new API is exposed to the IMS network by the network node 140.
  • this API may e.g. be expressed as
  • the IMS node 150 may thereby receive an API invocation from the network node 140 and may negotiate the media session by indicating as many media descriptors as participants in the conference.
  • "Negotiate", when used herein, means that the current session may have one media descriptor to indicate the content of the session, such as e.g. audio only.
  • the IMS node may e.g. instead indicate two media descriptors for the media session, such as audio and video.
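  • The (re)negotiation described above, i.e. re-offering the session with as many media descriptors as there are participants instead of a single mixed descriptor, can be sketched in SDP-like terms. The port numbering, payload type and labelling below are illustrative assumptions, not taken from the publication.

```python
def build_offer(participants, base_port=49170):
    """Build a minimal SDP-like offer with one audio descriptor per participant."""
    lines = ["v=0", "s=in-call service"]
    for i, p in enumerate(participants):
        port = base_port + 2 * i  # RTP ports are conventionally even
        lines.append(f"m=audio {port} RTP/AVP 96")
        lines.append(f"a=label:{p}")  # tag each descriptor with its participant
    return "\n".join(lines)

offer = build_offer(["userA", "userB"])
# Two participants yield two media descriptors instead of one.
assert offer.count("m=audio") == 2
```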
  • a suitable function such as the vMRF, may then separate the incoming audio.
  • the incoming audio from the user A is separated from the incoming audio from the user B.
  • the audio input from the user A is provided from the IMS node 150 to the network node 140 in a separate channel (indicated by the arrow) and the audio input from the user B is provided from the IMS node 150 to the network node 140 in another channel.
  • the audio intended for the DA such as voice commands, is provided to the network node 140 in yet another channel.
  • the users A and B are engaged in the media session which is being translated.
  • the audio input is provided from the one or more UEs associated with one or more speaking participants to the IMS node 150, and then to the network node 140.
  • the audio is separated in channels so as to increase the precision of the audio intended for translation.
  • Having received the audio input, the network node 140 performs the translation, and/or any other tasks associated with in-call services that the participants in the media session have activated.
  • the translated audio may thereafter be provided from the network node 140 to the IMS node 150 in separated channels; one channel comprising the translated audio from the user A, and one channel comprising the translated audio from the user B.
  • the translated audio is referred to as "in-call service audio", since the translation service is merely one of many in-call services that are compatible with, and may benefit from, the features of embodiments herein.
  • a suitable function in the IMS node 150 mixes the audio and sends an output to the participants in the media session.
  • the user A may receive as output from the IMS node 150 a channel with mixed audio comprising:
  • user B may receive a channel with mixed audio comprising:
  • the DA receives, in a separated channel from the IMS node 150, all audio that is preceded by a predefined keyword, such as "Operator".
  • the functionality of the DA is located in the network node 140.
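  • The keyword-based separation described above, where audio preceded by a predefined keyword such as "Operator" is delivered on the DA channel while other audio stays on the speaker's individual channel, can be sketched as follows (function and channel names are hypothetical):

```python
KEYWORD = "Operator"

def route(speaker: str, utterance: str) -> str:
    """Return the channel an utterance should be delivered on."""
    if utterance.strip().startswith(KEYWORD):
        return "DA"  # keyword-prefixed audio goes to the Digital Assistant
    return f"channel-{speaker}"  # everything else stays on the speaker's channel

assert route("userA", "Operator, stop the in-call service") == "DA"
assert route("userB", "Hello, how are you?") == "channel-userB"
```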
  • participants are engaged in a media session.
  • the media session may for example be a conference call between multiple participants.
  • any of the participants in the media session may activate an in-call service by e.g. saying: "DA, start in-call service".
  • This request in the form of a voice command is given to a UE associated with that participant.
  • the request is then sent from the first UE 121 associated with the participant to the IMS node 150.
  • the IMS node 150 forwards the request to start the in-call service to the network node 140.
  • This Action relates to Action 601 described below.
  • Having received the request, the network node 140 starts the in-call service.
  • the in-call service may be comprised in the DA platform or in a Third Party node.
  • Action 404 In order to successfully execute the in-call service, the network node 140 sends a request to the IMS node 150 to create individual media channels.
  • the media channels may e.g. be bi-directional audio channels.
  • This Action relates to Actions 501 and 602 described below.
  • the network node 140 may expose an API to the IMS network by sending an API invocation to the IMS node 150.
  • This Action relates to Actions 502 and 603 described below.
  • the IMS node 150 determines the number of participants in the media session.
  • the number of participants in the media session is important for the IMS node 150 to know since it affects the required number of individual media channels that are to be created.
  • The required number of individual channels may be, in a standard scenario, the number of participants plus one channel for the DA. In the example described above referring to Fig. 3, the number of participants was two, and therefore, in a standard scenario, the IMS node may create three channels, i.e. one for each participant and one for the DA. This Action relates to Action 503 described below.
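  • The channel-count arithmetic in this Action is simple: one individual channel per participant plus one for the DA. As a sketch (the function name is hypothetical):

```python
def required_channels(num_participants: int, da_channels: int = 1) -> int:
    # One individual media channel per participant, plus a separate
    # channel for audio to and from the Digital Assistant.
    return num_participants + da_channels

# In the Fig. 3 example there are two participants, so three channels.
assert required_channels(2) == 3
```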
  • Negotiating, which may also be expressed as renegotiating, comprises the IMS node 150 indicating as many media descriptors as participants in the media session.
  • the negotiation may e.g. take place between two or more SIP endpoints.
  • the IMS node 150 may negotiate with the network node 140. This Action relates to Action 504 described below.
  • Action 408 Having determined the number of individual media channels required for the in-call service, and having negotiated the media session, the IMS node 150 creates the required individual media channels. A functionality such as the vMRF in the IMS node 150 may be tasked to perform this, as explained above concerning the example in Fig. 3. This Action relates to Action 505 described below.
  • Action 409. The IMS node 150 is responsible for performing the media session.
  • Action 410. The IMS node may then receive input from the participants in the media session.
  • the IMS node 150 may also provide input to the network node 140.
  • the network node 140 may then perform the in-call service.
  • Action 413 The output from the in-call service is provided from the network node 140 to the IMS node 150.
  • the IMS node 150 may then forward the output to the participants in the media session.
  • the Actions 409-414 relate to Actions 506, 604 and 605 described below.
  • any participant may initiate a termination of the in-call service. This may e.g. be done through the use of a voice command to the DA, such as by saying: "DA, stop the in-call service".
  • the IMS node 150 receives this request from the participants.
  • the IMS node then forwards the request to the network node 140.
  • Action 417 The network node 140 then terminates the in-call service. This Action relates to Action 606 described below.
  • Action 418 If the created individual media channels are no longer needed, the network node 140 sends a request to the IMS node 150 to remove the created individual media channels. This Action relates to Action 607 described below.
  • Action 419 Upon receiving the request, the IMS node 150 removes the created individual media channels. Removing the created channels may e.g. be done by means of the API exposed to the IMS node 150, for creating and removing media channels, as explained above.
  • This Action relates to Action 507 described below.
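  • The exact expression of the create/remove API is not reproduced in this text. As a purely hypothetical illustration, its invocations could carry payloads along the following lines; all method and field names are assumptions, not taken from the publication.

```python
def create_channels_request(session_id, participants):
    """Hypothetical invocation asking the IMS node to create individual channels."""
    return {
        "method": "createIndividualMediaChannels",
        "sessionId": session_id,
        # One bi-directional audio channel per participant, plus one for the DA.
        "channels": [
            {"participant": p, "media": "audio", "direction": "bidirectional"}
            for p in participants
        ] + [{"participant": "DA", "media": "audio", "direction": "bidirectional"}],
    }

def remove_channels_request(session_id):
    """Hypothetical invocation removing the created individual channels."""
    return {"method": "removeIndividualMediaChannels", "sessionId": session_id}

req = create_channels_request("sess-1", ["userA", "userB"])
assert len(req["channels"]) == 3  # two participants plus the DA
```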
  • Example embodiments of a method performed by the IMS node 150 in the communications network 100, for handling media between participants in a media session, will now be described with reference to a flowchart depicted in Figure 5.
  • the IMS node 150 may be represented by a Virtual Multimedia Resource Function, vMRF, node.
  • the method comprises the following actions, which actions may be taken in any suitable order. Actions that are optional are presented in dashed boxes in Figure 5.
  • Action 501 The IMS node 150 obtains, from the network node 140, a request to create individual media channels for the media session. This Action relates to Action 404 described above and Action 602 described below.
  • the IMS node 150 may further receive, from the network node 140, an API invocation for creating and removing individual media channels. This Action relates to Action 405 described above and Action 603 described below.
  • the IMS node 150 may determine the number of participants in the media session, and thereby create at least one individual media channel for each determined participant. This Action relates to Action 406 described above.
  • the IMS node 150 may negotiate the media session by indicating a number of media descriptors corresponding to a number of participants in the media session. This Action relates to Action 407 described above.
  • the IMS node 150 creates individual media channels for the media session, at least one individual media channel for each participant. This Action relates to Action 408 described above.
  • the individual media channels may be bi-directional audio channels associated with one of the participants in the media session or the DA.
  • Action 506 The IMS node 150 performs the media session using the created individual media channels. This Action relates to Action 410 described above.
  • the IMS node 150 may remove the created individual media channels. This Action relates to Action 419 described above. Removing the created individual media channels may be performed upon request from one of the participants in the media session, via the network node 140.
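The IMS-node method above (obtain the request, create at least one individual media channel per participant, indicate one media descriptor per participant, and later remove the channels) can be sketched as follows. This is a minimal illustration only, not the patented implementation: the class, method names, ports, codec and SDP lines are hypothetical, and the SDP fragment is merely indicative of "one m= line per participant".

```python
# Illustrative sketch of Actions 501, 504, 505 and 507: one individual
# media channel per participant, and an SDP-style offer with one media
# descriptor (m= line) per created channel. All names are hypothetical.

class MediaFunction:
    def __init__(self):
        self.channels = {}  # participant id -> channel descriptor

    def create_individual_channels(self, participants, base_port=49170):
        """Create at least one individual media channel per participant."""
        for i, p in enumerate(participants):
            self.channels[p] = {"port": base_port + 2 * i, "codec": "AMR-WB"}
        return self.channels

    def sdp_offer(self):
        """Indicate a number of media descriptors corresponding to the
        number of participants: one m= line per created channel."""
        lines = ["v=0", "s=in-call service", "c=IN IP4 192.0.2.10"]
        for p, ch in self.channels.items():
            lines.append(f"m=audio {ch['port']} RTP/AVP 96")
            lines.append(f"a=label:{p}")  # ties the m= line to a participant
        return "\r\n".join(lines)

    def remove_channels(self):
        """Remove the created individual media channels again."""
        self.channels.clear()

mf = MediaFunction()
mf.create_individual_channels(["alice", "bob"])
offer = mf.sdp_offer()
```

With two participants, the offer contains two `m=audio` descriptors, one per individual channel.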
  • Example embodiments of a method performed by the network node 140 in the communications network 100, for handling media between participants in a media session, will now be described with reference to a flowchart depicted in Figure 6.
  • the method comprises the following actions, which actions may be taken in any suitable order. Actions that are optional are presented in dashed boxes in Fig. 6.
  • the network node 140 may be represented by a distributed node comprising the first network node 141 and the second network node 142.
  • the first network node 141 may be an operator controlled DA platform node.
  • the second network node 142 may comprise an in-call service.
  • Action 601. The network node 140 obtains the request to start an in-call service, which may e.g. be an in-call translation service. E.g. the network node receives this request from the IMS node 150, upon the IMS node 150 having received the same request from a participant in the media session.
  • This Action relates to Actions 401 and 402 described above.
  • the network node 140 sends, to the IMS node 150, another request to create individual media channels for the media session, at least one individual media channel for each participant.
  • the media session may comprise at least two participants and the input for the requested in-call service may be given as individual audio associated with one of the at least two participants.
  • the individual media channels may further be bi-directional audio channels associated with at least one of the participants in the media session or an operator-controlled DA.
  • This Action relates to Actions 404 and 501 described above.
  • the network node 140 may further expose an API to the IMS node 150, for creating and removing individual media channels. This Action relates to Actions 405 and 502 described above.
  • the network node 140 may further perform the requested in-call service. This Action relates to Action 412 described above.
  • the network node 140 may further provide the requested in-call service to the participants in the media session. This Action relates to Action 413 described above.
  • the network node 140 may further terminate the requested in-call service, upon request from one of the participants in the media session. This Action relates to Action 417 described above.
  • the network node 140 may further send a terminating request to the IMS node 150 to remove one or more of the created individual media channels. This Action relates to Action 418 described above.
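The network-node side (Actions 601-607) can be sketched similarly: obtain the request to start an in-call service, ask the IMS node to create individual media channels, and tear everything down when a participant stops the service. Both classes below are illustrative stand-ins under assumed names; no real IMS or DA-platform API is implied.

```python
# Hedged sketch of the network-node flow: start the in-call service
# (request channels from the IMS node) and terminate it (remove them).

class FakeImsNode:
    """Stand-in for the IMS node 150 (e.g. a vMRF)."""
    def __init__(self):
        self.channels = []

    def create_channels(self, participants):
        # At least one individual media channel per participant.
        self.channels = [f"ch-{p}" for p in participants]
        return self.channels

    def remove_channels(self):
        self.channels = []

class NetworkNode:
    """Stand-in for the network node 140 hosting the in-call service."""
    def __init__(self, ims_node):
        self.ims = ims_node
        self.service_active = False

    def start_in_call_service(self, participants):
        # Actions 601-602: obtain the request, ask for individual channels.
        self.ims.create_channels(participants)
        self.service_active = True

    def stop_in_call_service(self):
        # Actions 606-607: terminate the service, remove the channels.
        self.service_active = False
        self.ims.remove_channels()

ims = FakeImsNode()
node = NetworkNode(ims)
node.start_in_call_service(["alice", "bob"])
```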
  • the IMS node 150 may comprise the arrangement depicted in Fig. 7.
  • Fig. 7 is a block diagram depicting the IMS node 150 in two embodiments configured to operate in the communications network 100, wherein the communications network 100 comprises the network node 140.
  • the IMS node 150 may be used for handling media between participants in a media session, e.g. by creating individual media channels for the participants in the media session.
  • the IMS node 150 may be represented by a vMRF node.
  • the IMS node 150 may comprise processing circuitry 760, e.g. one or more processors, configured to perform the methods herein.
  • the IMS node 150 may comprise a communication interface 700 depicted in Fig. 7, configured to communicate e.g. with the network node 140.
  • the communication interface 700 may comprise a transceiver, a receiver, a transmitter, and/or one or more antennas.
  • the IMS node 150 may comprise an obtaining unit 701, e.g. a receiver, transceiver or retrieving module.
  • the IMS node 150, the processing circuitry 760, and/or the obtaining unit 701 may be configured to obtain from the network node 140, the request to create individual media channels for the media session.
  • the IMS node 150 may comprise a receiving unit 702, e.g. a receiver, transceiver or retrieving module.
  • the IMS node 150, the processing circuitry 760, and/or the receiving unit 702 may be configured to receive, from the network node 140, the API invocation for creating and removing individual media channels.
  • the IMS node 150 may comprise a determining unit and/or module 703.
  • the IMS node 150, the processing circuitry 760, and/or the determining unit 703 may be configured to determine the number of participants in the media session and create at least one individual media channel for each determined participant.
  • the IMS node 150 may comprise a negotiating unit and/or module 704.
  • the IMS node 150, the processing circuitry 760, and/or the negotiating unit 704 may be configured to negotiate the media session by indicating the number of media descriptors corresponding to the number of participants in the media session.
  • the IMS node 150 may comprise a creating unit and/or module 705.
  • the IMS node 150, the processing circuitry 760, and/or the creating unit 705 may be configured to create individual media channels for the media session, at least one individual media channel for each participant.
  • the individual media channels created by the IMS node 150 may be bi-directional audio channels associated with one of the participants in the media session and/or the DA.
  • the IMS node 150 may comprise a performing unit and/or module 706.
  • the IMS node 150, the processing circuitry 760, and/or the performing unit 706 may be configured to perform the media session using the created individual media channels.
  • the IMS node 150 may comprise a removing unit and/or module 707.
  • the IMS node 150, the processing circuitry 760, and/or the removing unit 707 may be configured to remove the created individual media channels.
  • the IMS node 150, the processing circuitry 760, and/or the removing unit 707 may further be configured to remove the created individual media channels upon request from one of the participants in the media session, via the network node 140.
  • the IMS node 150 further comprises a memory 770.
  • the memory comprises one or more units used to store data, such as participant information, media descriptors, media channel setup data, applications to perform the methods disclosed herein when being executed, and similar.
  • the methods according to the embodiments described herein for the IMS node 150 are implemented by means of e.g. a computer program product 780 or a computer program, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the IMS node 150.
  • the computer program 780 may be stored on a computer-readable storage medium 790, e.g. a disc or similar.
  • the computer-readable storage medium 790, having stored thereon the computer program product, may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the IMS node 150.
  • the computer-readable storage medium may be a non-transitory computer-readable storage medium.
  • the network node 140 may comprise the arrangement depicted in Fig. 8.
  • Fig. 8 is a block diagram depicting the network node 140 in two embodiments configured to operate in the communications network 100, wherein the communications network 100 comprises the first UE 121 and the second UE 122.
  • the network node 140 may be used for handling media between participants in a media session, e.g. sending requests to the IMS node 150 in the communications network 100 to create individual media channels for the media session.
  • the network node 140 may be represented by a distributed node comprising the first network node 141 and the second network node 142.
  • the first network node 141 may be represented by an operator controlled DA platform node and the second network node 142 may be configured to comprise the in-call service.
  • the in-call service may comprise an in-call translation service.
  • the network node 140 may comprise processing circuitry 860, e.g. one or more processors, configured to perform the methods herein.
  • the network node 140 may comprise a communication interface 800 depicted in Fig. 8, configured to communicate e.g. with the first UE 121 and the second UE 122.
  • the communication interface 800 may comprise a transceiver, a receiver, a transmitter, and/or one or more antennas.
  • the network node 140 may comprise an obtaining unit 801, e.g. a receiver, transceiver or obtaining module.
  • the network node 140, the processing circuitry 860, and/or the obtaining unit 801 may be configured to obtain the request to start an in-call service.
  • the media session may comprise at least two participants, and the input for the requested in-call service may be given as individual audio associated with one of the at least two participants.
  • the network node 140 may comprise a sending unit 802, e.g. a transmitter, transceiver or sending module.
  • the network node 140, the processing circuitry 860, and/or the sending unit 802 may be configured to send, to the IMS node 150, the other request to create individual media channels for the media session, at least one individual media channel for each participant.
  • the network node 140 may be configured to request individual media channels that are bi-directional audio channels associated with at least one of the participants in the media session and/or an operator-controlled DA.
  • the network node 140 may comprise an exposing unit and/or module 803.
  • the network node 140, the processing circuitry 860, and/or the exposing unit 803 may be configured to expose, to the IMS node 150, an API for creating and removing individual media channels.
  • the network node 140 may comprise a performing unit and/or module 804.
  • the network node 140, the processing circuitry 860, and/or the performing unit 804 may be configured to perform the requested in-call service.
  • the network node 140 may comprise a providing unit 805, e.g. a transmitter, transceiver or providing module.
  • the network node 140, the processing circuitry 860, and/or the providing unit 805 may be configured to provide the requested in-call service to the participants in the media session.
  • the network node 140 may comprise a terminating unit and/or module 806.
  • the network node 140, the processing circuitry 860, and/or the terminating unit 806 may be configured to terminate the requested in-call service, upon request from one of the participants in the media session.
  • the network node 140, the processing circuitry 860, and/or the terminating unit 806 may be configured to send to the IMS node 150, the terminating request to remove one or more of the created individual media channels.
  • the network node 140 further comprises a memory 870.
  • the memory comprises one or more units used to store data, such as media descriptors, requests, participant input, audio channels, DA features, applications to perform the methods disclosed herein when being executed, and similar.
  • the methods according to the embodiments described herein for the network node 140 are implemented by means of e.g. a computer program product 880 or a computer program, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the network node 140.
  • the computer program 880 may be stored on a computer-readable storage medium 890, e.g. a disc or similar.
  • the computer-readable storage medium 890, having stored thereon the computer program product, may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the network node 140.
  • the computer-readable storage medium may be a non-transitory computer-readable storage medium.
  • a communication system includes a telecommunication network 3210 such as the wireless communications network 100, e.g. a NR network, such as a 3GPP-type cellular network, which comprises an access network 3211, such as a radio access network, and a core network 3214.
  • the access network 3211 comprises a plurality of base stations 3212a, 3212b, 3212c, such as the radio network node 110, access nodes, AP STAs NBs, eNBs, gNBs or other types of wireless access points, each defining a corresponding coverage area 3213a, 3213b, 3213c.
  • a first user equipment (UE) e.g. the wireless devices 120 such as a Non-AP STA 3291 located in coverage area 3213c is configured to wirelessly connect to, or be paged by, the corresponding base station 3212c.
  • a second UE 3292 e.g. the first or second radio node 110, 120 or such as a Non-AP STA in coverage area 3213a is wirelessly connectable to the corresponding base station 3212a. While a plurality of UEs 3291, 3292 are illustrated in this example, the disclosed embodiments are equally applicable to a situation where a sole UE is in the coverage area or where a sole UE is connecting to the corresponding base station 3212.
  • the telecommunication network 3210 is itself connected to a host computer 3230, which may be embodied in the hardware and/or software of a standalone server, a cloud-implemented server, a distributed server or as processing resources in a server farm.
  • the host computer 3230 may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider.
  • the connections 3221, 3222 between the telecommunication network 3210 and the host computer 3230 may extend directly from the core network 3214 to the host computer 3230 or may go via an optional intermediate network 3220.
  • the intermediate network 3220 may be one of, or a combination of more than one of, a public, private or hosted network; the intermediate network 3220, if any, may be a backbone network or the Internet; in particular, the intermediate network 3220 may comprise two or more sub-networks (not shown).
  • the communication system of Figure 9 as a whole enables connectivity between one of the connected UEs 3291, 3292 and the host computer 3230.
  • the connectivity may be described as an over-the-top (OTT) connection 3250.
  • the host computer 3230 and the connected UEs 3291, 3292 are configured to communicate data and/or signaling via the OTT connection 3250, using the access network 3211, the core network 3214, any intermediate network 3220 and possible further infrastructure (not shown) as intermediaries.
  • the OTT connection 3250 may be transparent in the sense that the participating communication devices through which the OTT connection 3250 passes are unaware of routing of uplink and downlink communications.
  • a base station 3212 may not or need not be informed about the past routing of an incoming downlink communication with data originating from a host computer 3230 to be forwarded (e.g., handed over) to a connected UE 3291. Similarly, the base station 3212 need not be aware of the future routing of an outgoing uplink communication originating from the UE 3291 towards the host computer 3230.
  • a host computer 3310 comprises hardware 3315 including a communication interface 3316 configured to set up and maintain a wired or wireless connection with an interface of a different communication device of the communication system 3300.
  • the host computer 3310 further comprises processing circuitry 3318, which may have storage and/or processing capabilities.
  • the processing circuitry 3318 may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions.
  • the host computer 3310 further comprises software 3311, which is stored in or accessible by the host computer 3310 and executable by the processing circuitry 3318.
  • the software 3311 includes a host application 3312.
  • the host application 3312 may be operable to provide a service to a remote user, such as a UE 3330 connecting via an OTT connection 3350 terminating at the UE 3330 and the host computer 3310. In providing the service to the remote user, the host application 3312 may provide user data which is transmitted using the OTT connection 3350.
  • the communication system 3300 further includes a base station 3320 provided in a telecommunication system and comprising hardware 3325 enabling it to communicate with the host computer 3310 and with the UE 3330.
  • the hardware 3325 may include a communication interface 3326 for setting up and maintaining a wired or wireless connection with an interface of a different communication device of the communication system 3300, as well as a radio interface 3327 for setting up and maintaining at least a wireless connection 3370 with a UE 3330 located in a coverage area (not shown in Figure 10) served by the base station 3320.
  • the communication interface 3326 may be configured to facilitate a connection 3360 to the host computer 3310.
  • connection 3360 may be direct or it may pass through a core network (not shown in Figure 10) of the telecommunication system and/or through one or more intermediate networks outside the telecommunication system.
  • the hardware 3325 of the base station 3320 further includes processing circuitry 3328, which may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions.
  • the base station 3320 further has software 3321 stored internally or accessible via an external connection.
  • the communication system 3300 further includes the UE 3330 already referred to.
  • Its hardware 3335 may include a radio interface 3337 configured to set up and maintain a wireless connection 3370 with a base station serving a coverage area in which the UE 3330 is currently located.
  • the hardware 3335 of the UE 3330 further includes processing circuitry 3338, which may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions.
  • the UE 3330 further comprises software 3331, which is stored in or accessible by the UE 3330 and executable by the processing circuitry 3338.
  • the software 3331 includes a client application 3332.
  • the client application 3332 may be operable to provide a service to a human or non-human user via the UE 3330, with the support of the host computer 3310.
  • an executing host application 3312 may be operable to provide a service to a human or non-human user via the UE 3330, with the support of an executing client application 3332 at the UE 3330.
  • the client application 3332 may receive request data from the host application 3312 and provide user data in response to the request data.
  • the OTT connection 3350 may transfer both the request data and the user data.
  • the client application 3332 may interact with the user to generate the user data that it provides.
  • the host computer 3310, base station 3320 and UE 3330 illustrated in Figure 10 may be identical to the host computer 3230, one of the base stations 3212a, 3212b, 3212c and one of the UEs 3291, 3292 of Figure 9, respectively.
  • the inner workings of these entities may be as shown in Figure 10 and independently, the surrounding network topology may be that of Figure 9.
  • the OTT connection 3350 has been drawn abstractly to illustrate the communication between the host computer 3310 and the user equipment 3330 via the base station 3320, without explicit reference to any intermediary devices and the precise routing of messages via these devices.
  • Network infrastructure may determine the routing, which it may be configured to hide from the UE 3330 or from the service provider operating the host computer 3310, or both. While the OTT connection 3350 is active, the network infrastructure may further take decisions by which it dynamically changes the routing (e.g., on the basis of load balancing considerations or reconfiguration of the network).
  • the wireless connection 3370 between the UE 3330 and the base station 3320 is in accordance with the teachings of the embodiments described throughout this disclosure.
  • One or more of the various embodiments improve the performance of OTT services provided to the UE 3330 using the OTT connection 3350, in which the wireless connection 3370 forms the last segment. More precisely, the teachings of these embodiments may improve the in-call service with benefits such as improved quality of e.g. transcripts and other services.
  • a measurement procedure may be provided for the purpose of monitoring data rate, latency and other factors on which the one or more embodiments improve.
  • the measurement procedure and/or the network functionality for reconfiguring the OTT connection 3350 may be implemented in the software 3311 of the host computer 3310 or in the software 3331 of the UE 3330, or both.
  • sensors may be deployed in or in association with communication devices through which the OTT connection 3350 passes; the sensors may participate in the measurement procedure by supplying values of the monitored quantities exemplified above, or supplying values of other physical quantities from which software 3311, 3331 may compute or estimate the monitored quantities.
  • the reconfiguring of the OTT connection 3350 may include message format, retransmission settings, preferred routing etc.; the reconfiguring need not affect the base station 3320, and it may be unknown or imperceptible to the base station 3320. Such procedures and functionalities may be known and practiced in the art.
  • measurements may involve proprietary UE signaling facilitating the host computer’s 3310 measurements of throughput, propagation times, latency and the like.
  • the measurements may be implemented in that the software 3311, 3331 causes messages to be transmitted, in particular empty or 'dummy' messages, using the OTT connection 3350 while it monitors propagation times, errors etc.
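As a rough illustration of such a measurement procedure, the sketch below times empty "dummy" probe messages to estimate an average round-trip time. The `send` callable is a local stub standing in for the real OTT transport; all names and timings are hypothetical.

```python
# Illustrative only: estimate propagation time over a connection by
# timing empty probe messages, as the measurement procedure suggests.

import time

def measure_rtt(send, samples=5):
    """Return the average round-trip time over `samples` dummy messages."""
    rtts = []
    for _ in range(samples):
        t0 = time.monotonic()
        send(b"")  # empty 'dummy' probe message
        rtts.append(time.monotonic() - t0)
    return sum(rtts) / len(rtts)

def stub_send(payload):
    """Local stand-in for the OTT transport: pretend ~1 ms of delay."""
    time.sleep(0.001)

avg = measure_rtt(stub_send)
```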
  • FIG 11 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
  • the communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 9 and Figure 10.
  • a host computer provides user data.
  • the host computer provides the user data by executing a host application.
  • the host computer initiates a transmission carrying the user data to the UE.
  • the base station transmits to the UE the user data which was carried in the transmission that the host computer initiated, in accordance with the teachings of the embodiments described throughout this disclosure.
  • the UE executes a client application associated with the host application executed by the host computer.
  • FIG 12 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
  • the communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 9 and Figure 10. For simplicity of the present disclosure, only drawing references to Figure 12 will be included in this section.
  • the host computer provides user data.
  • the host computer provides the user data by executing a host application.
  • the host computer initiates a transmission carrying the user data to the UE. The transmission may pass via the base station, in accordance with the teachings of the embodiments described throughout this disclosure.
  • the UE receives the user data carried in the transmission.
  • FIG. 13 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
  • the communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 9 and Figure 10.
  • In a first action 3610 of the method, the UE receives input data provided by the host computer.
  • the UE provides user data.
  • the UE provides the user data by executing a client application.
  • the UE executes a client application which provides the user data in reaction to the received input data provided by the host computer.
  • the executed client application may further consider user input received from the user.
  • the UE initiates, in an optional third subaction 3630, transmission of the user data to the host computer.
  • the host computer receives the user data transmitted from the UE, in accordance with the teachings of the embodiments described throughout this disclosure.
  • FIG 14 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment.
  • the communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 9 and Figure 10.
  • In a first action 3710 of the method, in accordance with the teachings of the embodiments described throughout this disclosure, the base station receives user data from the UE.
  • the base station initiates transmission of the received user data to the host computer.
  • the host computer receives the user data carried in the transmission initiated by the base station.
  • The term "processor" or "controller" as used herein does not exclusively refer to hardware capable of executing software and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random-access memory (RAM) for storing software and/or program or application data, and non-volatile memory.
  • a non-limiting term "UE" is used.
  • the UE herein may be any type of UE capable of communicating with a network node or another UE over radio signals.
  • the UE may also be a radio communication device, target device, device to device (D2D) UE, machine type UE or UE capable of machine to machine (M2M) communication, laptop embedded equipment (LEE), laptop mounted equipment (LME), Customer Premises Equipment (CPE), etc.
  • network node may be any kind of network node which may comprise a core network node (e.g., NOC node, Mobility Management Entity (MME), Operation and Maintenance (O&M) node, Self-Organizing Network (SON) node, a coordinating node, controlling node, Minimizing Drive Test (MDT) node, etc.), or an external node (e.g., 3rd party node, a node external to the current network), or even a radio network node such as base station, radio base station, base transceiver station, base station controller, network controller, evolved Node B (eNB), Node B, multi-RAT base station, Multi-cell/multicast Coordination Entity (MCE), relay node, access point, radio access point, Remote Radio Unit (RRU), Remote Radio Head (RRH), etc.
  • the term "radio node" used herein may be used to denote the wireless device or the radio network node.
  • the term "signaling" used herein may comprise any of: high-layer signaling, e.g., via Radio Resource Control (RRC), lower-layer signaling, e.g., via a physical control channel or a broadcast channel, or a combination thereof.
  • the signaling may be implicit or explicit.
  • the signaling may further be unicast, multicast or broadcast.
  • the signaling may also be directly to another node or via a third node.
  • LTE Frequency Division Duplex (FDD), LTE Time Division Duplex (TDD), LTE with frame structure 3 or unlicensed operation, UTRA, GSM (Global System for Mobile communications), WiFi (Wireless Fidelity), short-range communication RATs, narrow band RATs, RATs for 5G, etc.

Abstract

A method performed by an Internet Protocol Multimedia Subsystem, IMS, node in a communications network, for handling media from one or more participants in a media session is provided. The IMS node obtains (501), from a network node, a request to create one or more individual media channels, for each participant. The IMS node then creates (504) one or more individual media channels for each participant, and performs (505) the media session using the created one or more individual media channels.

Description

IMS NODE, NETWORK NODE AND METHODS IN A COMMUNICATIONS NETWORK
TECHNICAL FIELD
Embodiments herein relate to an Internet Protocol Multimedia Subsystem (IMS) node, a network node and methods performed therein. In particular, embodiments herein relate to handling media between participants in a media session.
BACKGROUND
Over-The-Top (OTT) services have been introduced in wireless communication networks allowing a third party telecommunications service provider to provide services that are delivered across an IP network. The IP network may e.g. be a public internet or cloud services delivered via a third party access network, as opposed to a carrier's own access network. OTT may refer to a variety of services including communications, such as e.g. voice and/or messaging, content, such as e.g. TV and/or music, and cloud-based offerings, such as e.g. computing and storage.
Traditional communication networks such as e.g. Internet Protocol Multimedia Subsystem (IMS) Networks are based on explicit Session Initiation Protocol (SIP) signaling methods. The IMS network typically requires a user to invoke various
communication services by using a keypad and/or screen of a user equipment (UE) such as a smart phone device. A further OTT service is a Digital Assistant (DA). The DA may perform tasks or services upon request from a user, and may be implemented in several ways.
A first way to implement the DA may be to provide the UE of the user with direct access to a network node controlled by a third party service provider comprising a DA platform. This may e.g. be done using a dedicated UE having access to the network node. This way of implementing the DA is commonly referred to as an OTT-controlled DA.
A further way to implement the DA is commonly referred to as an operator controlled DA. In an operator controlled DA, functionality such as e.g. keyword detection, request fulfillment and media handling may be contained within the domain of the operator, referred to as the operator domain. Thus, the operator controls the whole DA solution without the UE being impacted. A user of the UE may provide instructions, such as e.g. voice commands, to a core network node, such as e.g. an IMS node, of the operator. The voice command may e.g. be "Digital Assistant, I want a pizza", "Digital Assistant, tell me how many devices are active right now", "Digital Assistant, set-up a conference", or "Digital Assistant, how much credit do I have?". The core network node may detect a hot word, which may also be referred to as a keyword, indicating that the user is providing instructions to the DA, and may forward the instructions to a network node controlled by a third party service provider; the network node may e.g. comprise a DA platform. The DA platform may e.g. be a bot, e.g. a software program, of a company providing a certain service, such as e.g. a taxi service or a food delivery service. The instructions may be forwarded to the DA platform using e.g. the Session Initiation Protocol / Real-time Transport Protocol (SIP/RTP). The DA platform may comprise certain functionality, such as e.g. Speech2Text, Identification of Intents & Entities and Control & Dispatch of Intents. The DA platform may then forward the instructions to a further network node, which may e.g. be an Application Server (AS) node, which has access to the core network node via an Application Programming Interface (API) denoted as a Service Exposure API. Thereby the DA may access the IMS node and perform services towards the core network node. The DA platform is often required to pay a fee to the operator in order to be reachable by the operator's DA users.
The user may also be required to pay fees to the operator and network provider for the usage of DA services. The operator may further be required to pay fees to the network provider for every transaction performed via the Service Exposure API.
An operator controlled DA may be used in conjunction with an in-call service. As mentioned above, in the operator controlled DA model, the operator has full control of the media. This enables the implementation of in-call services, such as in-call translations. In such in-call services, the operator may listen to a conversation and perform the requested service, e.g., translate and/or transcribe what is said in the conversation. In the case of an in-call translation/transcription service, the operator listens to the conversation and translates and/or transcribes the incoming audio from the participants in the media session. The written transcript and translated content may then be continuously delivered to the users in real time.
The transcript and/or translated content may be delivered to the user in several ways, such as e.g. via messaging to each user or published on a web page where users can see both the transcript and the associated translation.
An in-call translation, in general, comprises two main relevant parts: (1) capturing and transcribing an audio input via a UE, i.e. through the microphone; and (2) performing the translation.
In an in-call service, such as a translation service, the challenge is often not in the translation itself, but in the capture of the audio. A translation service may misunderstand what is said due to, e.g., background noise, a person’s accent or articulation, and/or flaws in the speech recognition system. If the capture of the audio is flawed, it will most likely generate incorrect transcripts and consequent incorrect translations.
Acquiring a correct transcript and providing a correct translation is challenging due to several factors. One such factor is human behavior. When humans are engaged in face-to-face conversations, the participants can perceive when an interruption occurs, e.g. through non-verbal communication. In a media session, such as a telephone call, however, there is no non-verbal communication. Therefore it may be difficult for a participant to know if he/she is interrupting another participant. Thus, in media sessions, interruptions are more likely to occur, which may result in a mixed audio input being sent to the in-call service. Such mixed audio may be difficult for the in-call service to transcribe properly, and the resulting transcription and/or translation is very likely to be erroneous.
Another factor affecting the audio capture is background noise. While a participant is speaking, and other participants do not mute their microphones, the background noise from the other participants will be added to the audio capture that is sent to the in-call service.
There are numerous in-call services that may be affected by inadequate audio capture. In addition to the transcription/translation service mentioned above, call recording is one such example, where the quality of the recording is directly related to the quality of the audio input. Furthermore, the quality of a call recording is significantly increased if, for example, a DA can identify the owner of a voice.
SUMMARY
The quality of in-call services, such as in-call translation services, is highly dependent on the quality of the captured audio. Furthermore, in-call services may be very complex. When an audio input comes in a single channel with all participants’ audio mixed, it is difficult for the Digital Assistant to identify the source of the audio, which may lead to a lower quality of e.g. the translation. Thus, a single media channel comprising audio input from multiple participants is not an ideal solution.
An object of embodiments herein is therefore to improve the performance of in-call services such as in-call translations, in a communication network.
According to an aspect of embodiments herein, the object is achieved by a method performed by an IMS node in a communications network, for handling media from one or more participants in a media session. The IMS node obtains, from a network node, a request to create one or more individual media channels for each participant. The IMS node then creates one or more individual media channels for each participant, and performs the media session using the created one or more individual media channels.
According to another aspect of embodiments herein, the object is achieved by a method performed by a network node in a communications network, for handling media from one or more participants in a media session. The network node obtains a request to start an in-call service, and sends, to the IMS node, a request to create one or more individual media channels for each participant.
According to a further aspect of embodiments herein, the object is achieved by an IMS node in a communications network, configured to handle media between participants in a media session. The IMS node is further configured to obtain, from a network node, a request to create individual media channels for the media session. The IMS node is further configured to create individual media channels for the media session, at least one individual media channel for each participant, and perform the media session using the created individual media channels.
According to a yet further aspect of embodiments herein, the object is achieved by a network node in a communications network configured to handle media between participants in a media session. The network node is further configured to obtain a request to start an in-call service and send, to an IMS node, a request to create individual media channels for the media session, at least one individual media channel for each participant.
The performance of in-call services, such as in-call translations in a communications network, may be improved according to the embodiments above, e.g. since individual media channels are created for each participant thus improving e.g. quality of the audio and hence also transcripts of the audio.
BRIEF DESCRIPTION OF THE DRAWINGS
Examples of embodiments herein are described in more detail with reference to attached drawings in which:
Figure 1 is a schematic diagram illustrating an operator controlled DA.
Figure 2 is a schematic diagram illustrating embodiments of a communications network.
Figure 3 is a schematic overview depicting embodiments of a method in the communications network.
Figure 4 is a signalling diagram depicting embodiments of a method in the communications network.
Figure 5 is a flowchart depicting embodiments of a method in an IMS node.
Figure 6 is a flowchart depicting embodiments of a method in a network node.
Figure 7 is a schematic diagram illustrating embodiments of an IMS node.
Figure 8 is a schematic diagram illustrating embodiments of a network node.
Figure 9 schematically illustrates a telecommunications network connected via an intermediate network to a host computer.
Figure 10 is a generalized block diagram of a host computer communicating via a base station with a user equipment over a partially wireless connection.
Figures 11 to 14 are flowcharts illustrating methods implemented in a communication system including a host computer, a base station and a UE.
DETAILED DESCRIPTION
Embodiments herein relate to a mechanism for allowing the creation and deletion of media channels, e.g. carrying audio, from all participants involved in a media session, such as a conference call, as well as from an operator controlled DA.
In particular, embodiments herein facilitate services such as in-call translations, recordings and call transcripts, since it enables handling media from different call participants individually.
Furthermore, such a mechanism allows a DA to listen to additional requests from the participants while at the same time e.g. translating, transcribing and/or recording the conversation.
In order to expose such a mechanism, a new Application Program Interface (API) may e.g. be provided by the IMS network. Figure 1 depicts the fundamentals of an operator controlled DA, herein referred to as a DA. In Figure 1, a first and a second UE user, A and B, are connected to a DA platform node via the IMS CN. The communication between the entities may be performed with Voice over IP (VoIP) communication, through Session Initiation Protocol (SIP) and Real-time Transport Protocol (RTP) signalling methods. The DA platform node may in turn be connected to entities in a third party domain, such as databases and cloud based services.
Any user or participant involved in a media session, i.e. both the users A and B as depicted in Figure 1, may engage an in-call service such as a translation service through the use of the DA. The user A may in such a scenario e.g. say "Operator, translate this call". The DA may then, in response, activate an in-call translation service which may e.g. be provided via a cloud based service.
The users A and B may each be associated with a respective UE: a first UE 121 and a second UE 122 respectively. The first UE 121 and the second UE 122 provide an interface so that the users of the UEs can convey information to the DA and to any other participant in the media session.
In a scenario when a DA has been engaged to activate an in-call translation service, the DA is in full control of the media in the media session and, accordingly, of the transcriptions and translations that are taking place during the course of the media session. The translation service may be deactivated, via the DA, at any time by any of the participants in the media session.
As described above, a problem with in-call translation services may be that the audio input is flawed. Therefore, the interface on the respective UE is important for the participants in the media session in order for them to be able to see if a spoken sentence has been correctly captured by the DA.
Thus, it may be useful for the users A and B depicted in Figure 1 to each receive a transcript of the audio in the media session to their respective UEs 121, 122.
Figure 2 is a schematic overview depicting a communications network 100 wherein embodiments herein may be implemented. The communications network 100 comprises one or more RANs and one or more CNs. The communications network 100 may use any technology such as 5G new radio (NR) but may further use a number of other different technologies, such as Wi-Fi, long term evolution (LTE), LTE-Advanced, wideband code division multiple access (WCDMA), global system for mobile communications/enhanced data rate for GSM evolution (GSM/EDGE), worldwide interoperability for microwave access (WiMax), or ultra mobile broadband (UMB), just to mention a few possible implementations.
One or more network nodes 140 operate in the communications network 100.
Such a network node may be a cloud based server or an application server providing processing capacity for, e.g., managing a DA, handling conferencing, and handling translations in an ongoing media session between participants. The network nodes may e.g. comprise a first network node 141 and a second network node 142.
Nodes in an IMS network, such as an IMS node 150, also operate in the communications network 100. The IMS node 150 may e.g. be comprised in the CN. The IMS node 150 may be connected to one or more of the network nodes 140. The IMS node 150 may e.g. be connected to the first network node 141. The first network node 141 may e.g. be represented by an Application Server (AS) node or a DA platform node. The first network node 141 may be located in a cloud 101 as depicted in Figure 1, in the CN or in a Third Party domain of the communications network 100. The first network node 141 may act as a gateway to the second network node 142, which may e.g. be represented by an Application Server (AS) node or a platform node, located in the cloud 101 or in a Third Party domain of the communications network 100. Furthermore, the IMS node 150, the first network node 141 and the second network node 142 may be collocated nodes, stand-alone nodes or distributed nodes comprised fully or partly in the cloud 101.
The methods according to some embodiments herein are performed by the one or more network nodes 140, which may comprise the first network node 141 and the second network node 142.
The communications network 100 may further comprise one or more radio network nodes 110 providing radio coverage over a respective geographical area by means of antennas or similar. The geographical area may be referred to as a cell, a service area, beam or a group of beams. The radio network node 110 may be a transmission and reception point e.g. a radio access network node such as a base station, e.g. a radio base station such as a NodeB, an evolved Node B (eNB, eNode B), an NR Node B (gNB), a base transceiver station, a radio remote unit, an Access Point Base Station, a base station router, a transmission arrangement of a radio base station, a stand-alone access point, a Wireless Local Area Network (WLAN) access point, an Access Point Station (AP STA), an access controller, a UE acting as an access point or a peer in a Mobile device to Mobile device (D2D) communication, or any other network unit capable of communicating with a UE within the cell served by the radio network node 110 depending e.g. on the radio access technology and terminology used.
UEs such as the first UE 121 and the second UE 122 operate in the communications network 100. The first UE 121 and the second UE 122 may e.g. each be a mobile station, a non-access point (non-AP) STA, a STA, a user equipment (UE) and/or a wireless terminal, an NB-Internet of Things (NB-IoT) mobile device, a Wi-Fi mobile device, an LTE mobile device or an NR mobile device, communicating via one or more Access Networks (AN), e.g. a RAN, to one or more core networks (CN). It should be understood by those skilled in the art that "UE" is a non-limiting term which means any terminal, wireless communication terminal, wireless mobile device, device to device (D2D) terminal, or node, e.g. a smart phone, laptop, mobile phone, sensor, relay, mobile tablet, television unit or even a small base station communicating within a cell.
The methods according to some embodiments herein are performed by the first UE 121 , and the methods according to some other embodiments are performed by the second UE 122.
It should be noted that although terminology from 3GPP LTE has been used in this disclosure to exemplify the embodiments herein, this should not be seen as limiting the scope of the embodiments herein to only the aforementioned system. Other wireless or wireline systems, including WCDMA, WiMax, UMB, GSM network, any 3GPP cellular network or any cellular network or system, may also benefit from exploiting the ideas covered within this disclosure.
An example of embodiments herein is depicted in Fig. 3 and will be explained by means of the following example scenario.
In the example scenario depicted in Fig. 3, user A and user B are engaged in a media session, such as an audio conferencing session. Thus, in the media session, there are two participants, user A and user B, each associated with a respective UE. The user A is associated with the first UE 121 and the user B is associated with the second UE 122.
The first UE 121 and the second UE 122 are connected to the network node 140 via the IMS node 150. The network node 140 may be a distributed node comprising the first network node 141 and the second network node 142. The first network node 141 may be a DA platform node and the second network node 142 may be a Third Party in-call service provider.
In the example scenario depicted in Fig. 3, the network node 140 comprises at least the first network node 141 , which is a DA platform node. The first network node 141 may be connected to, or comprise, an in-call translation service.
The participants, i.e. the user A and the user B, in the media session in Fig. 3 want to activate an in-call translation service. Either of the users A and B may start the in-call translation service by e.g. giving a voice command to the DA. Such a voice command may comprise a keyword, such as "operator", and an intent, such as "start translation". Thus, the user A, for example, may start the in-call translation by saying "Operator, translate the call!".
As mentioned above, in order for the in-call translation service to work efficiently, the quality of the audio input from the participants in the media session should be as good as possible. The network node 140 is aware of the number of participants in the media session, i.e. two in the example in Fig. 3, and may therefore request that the IMS network creates individual media channels for each participant in the media session. The network node 140 may further require that a separate channel is created for audio input to, and audio output from, the DA.
In the media session, audio input from the user A and the user B is provided to the IMS network, through the IMS node 150. This is illustrated in Fig. 3 by means of the arrows indicating the source and direction of the audio.
The IMS node 150 may comprise a Virtual Multimedia Resource Function (vMRF), which may be suitable for the task of creating individual media channels for the audio in the media session. The vMRF may be used to mix and process media streams and may be controlled by the Multimedia Telephony Application Server (MTAS), using, for example, an H.248 control protocol.
Amongst other functionality, the vMRF supports playing tones and audio announcements, as well as handling audio conferencing, detection of Dual Tone Multi Frequency (DTMF) tones, and DTMF tone forwarding for audio conferencing.
When audio conferencing is handled by the vMRF, it may be specified for each participant in the media session whether the connection is unidirectional, i.e. allowing listening only, or bidirectional. The audio conferencing feature in the vMRF may dynamically mix audio from several participants in the media session. The audio from an individual participant may be mixed to the audio conference based on active speaker logic. Each participant may have its own codec in use, and the audio conferencing feature in the vMRF may provide transcoding between the participants. Audio mixing may e.g. be performed at 16 kHz sampling rate, which may enable wideband conferencing for wideband-capable UEs.
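The per-participant mixing rule described above can be sketched in a simplified form. This is not the vMRF's actual implementation; it merely illustrates, under the assumption of plain lists of PCM samples, that the mix toward each listener sums the audio of all other participants while excluding the listener's own audio.

```python
# Simplified conference mixing sketch (illustrative only): for one mixing
# period, sum every other speaker's samples into the leg toward `listener`,
# so the listener never hears an echo of their own audio.
def mix_for_participant(listener, frames):
    """frames: mapping speaker -> list of PCM samples for one period."""
    others = [samples for speaker, samples in frames.items()
              if speaker != listener]
    if not others:
        return []
    length = max(len(s) for s in others)
    # Sum sample-wise; shorter frames simply contribute nothing at the end.
    return [sum(s[i] for s in others if i < len(s)) for i in range(length)]
```

A real mixer would additionally apply active speaker logic, transcoding between codecs, and saturation of the summed samples, as the surrounding text notes.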
The vMRF supports at least the following audio codecs: linear Pulse Code Modulation (PCM); Enhanced Voice Services (EVS); Adaptive Multi-Rate audio codec (AMR-NB); Adaptive Multi-Rate Wideband (AMR-WB); and International Telecommunication Union (ITU) standards such as G.711, G.722 and G.729.
The vMRF is a Virtualized Network Function (VNF) and a single VNF may contain multiple Virtual Machines (VMs). The vMRF VNF may be deployed in the communications network 100 multiple times, for example as a separate VNF each time. In the vMRF VNF, each VM may provide various functions, such as a Payload (PL) function and a System Controller (SC) function. Accordingly, in the example scenario in Fig. 3, when the network node 140 requests that individual media channels should be set up for the two participants and the DA in the media session, the vMRF in the IMS node 150 may suitably be charged with fulfilling this request.
Thus, a request may be sent from the network node 140 to the IMS node 150 to create at least as many individual media channels as the number of call participants and at least one individual media channel for the DA. However, in order for such a request to be fulfilled, a function that allows media to be delivered and processed individually may be required. Thus, to allow the creation and deletion of audio channels, a mechanism such as an Application Program Interface (API) may be employed. In embodiments herein, a new API is therefore provided. In the example in Fig. 3, such a new API is exposed to the IMS network by the network node 140. In the case of the in-call service being an in-call translation service, this API may e.g. be expressed as
Create audio channels for interpreter (sessionId).
The IMS node 150 may thereby receive an API invocation from the network node 140 and may negotiate the media session by indicating as many media descriptors as there are participants in the conference. Negotiate, when used herein, means that the current session may have one media descriptor to indicate the content of the session, such as e.g. audio only. After negotiating once again, also referred to as re-negotiating, wherein the IMS node and the network node agree on the number of media descriptors, the IMS node may e.g. instead indicate two media descriptors for the media session, such as audio and video.
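The re-negotiation with one media descriptor per leg can be sketched as the construction of an SDP-style offer. This is a hedged illustration only: the port numbering, the payload type and the use of an `a=label` attribute to tie each descriptor to a participant or to the DA are assumptions for the sketch, not details disclosed herein.

```python
# Sketch of building one "m=audio" media descriptor per participant,
# plus one for the DA. Port numbers and the label attribute are
# illustrative assumptions, not a specified wire format.
def build_sdp_media_descriptors(participants, include_da=True, base_port=49170):
    legs = list(participants) + (["DA"] if include_da else [])
    lines = []
    for i, leg in enumerate(legs):
        port = base_port + 2 * i          # RTP ports are conventionally even
        lines.append(f"m=audio {port} RTP/AVP 0")
        lines.append(f"a=label:{leg}")    # ties the descriptor to one leg
    return lines
```

For the two-participant example in Fig. 3, such an offer would carry three media descriptors: one per participant and one for the DA.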
In the IMS node 150, a suitable function, such as the vMRF, may then separate the incoming audio. Referring again to the example depicted in Fig. 3, the incoming audio from the user A is separated from the incoming audio from the user B. As is shown in Fig. 3, the audio input from the user A is provided from the IMS node 150 to the network node 140 in a separate channel (indicated by the arrow) and the audio input from the user B is provided from the IMS node 150 to the network node 140 in another channel. The audio intended for the DA, such as voice commands, is provided to the network node 140 in yet another channel.
In the media session in Fig. 3, the users A and B are engaged in the media session which is being translated. Thus, when something is spoken, the audio input is provided from the one or more UEs associated with one or more speaking participants to the IMS node 150, and then to the network node 140. The audio is separated in channels so as to increase the precision of the audio intended for translation. Having received the audio input, the network node 140 performs the translation, and/or any other tasks associated with in-call services that the participants in the media session have activated.
The translated audio may thereafter be provided from the network node 140 to the IMS node 150 in separated channels; one channel comprising the translated audio from the user A, and one channel comprising the translated audio from the user B. In Fig. 3, the translated audio is referred to as“in-call service audio”, since the translation service is merely one of many in-call services that are compatible with, and may benefit from, the features of embodiments herein.
A suitable function in the IMS node 150, such as the vMRF, mixes the audio and sends an output to the participants in the media session. In the example scenario, the user A may receive as output from the IMS node 150 a channel with mixed audio comprising:
1. The original audio input from the user B;
2. the in-call service audio from the network node 140, i.e. the translated audio from the user B in Fig. 3; and
3. any audio from the DA, e.g. with information such as "this call is being translated".
Correspondingly, user B may receive a channel with mixed audio comprising:
1. The original audio input from the user A;
2. the in-call service audio from the network node 140, i.e. the translated audio from the user A in Fig. 3; and
3. any audio from the DA, e.g. with information such as "this call is being translated".
Furthermore, the DA receives, in a separate channel from the IMS node 150, all audio that is preceded by a predefined keyword, such as "Operator". As mentioned above, the functionality of the DA is located in the network node 140.
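The composition of each participant's mixed output, as enumerated in the two lists above, can be sketched as follows; the channel labels are hypothetical placeholders standing in for actual audio streams.

```python
# Sketch of the output mix toward one listener: the other participants'
# original audio, the in-call service (e.g. translated) audio for those
# participants, and any DA announcement. Labels are illustrative only.
def output_channels(listener, participants, translated, da_audio):
    others = [p for p in participants if p != listener]
    mix = [f"original:{p}" for p in others]
    mix += [f"translated:{p}" for p in others if p in translated]
    if da_audio:
        mix.append("da:announcement")
    return mix
```

For user A in the Fig. 3 scenario, this yields the original audio from user B, the translated audio from user B, and the DA announcement, matching the first list above.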
The methods herein will first be described from a helicopter perspective, as a signaling diagram showing the involved nodes, such as the IMS node 150 and the network node 140, with reference to Fig. 4. Thereafter, embodiments of the methods as seen from the perspective of the IMS node 150 and the network node 140, respectively, will be briefly described one by one with reference to Figures 5 and 6. An example of embodiments herein is depicted in Fig. 4 and will be explained by means of the following example scenario.
In the example in Fig. 4, participants are engaged in a media session. The media session may for example be a conference call between multiple participants.
Action 401. Using a DA, such as an operator controlled DA, any of the participants in the media session may activate an in-call service by e.g. saying: "DA, start in-call service". This request, in the form of a voice command, is given to a UE associated with that participant. The request is then sent from the first UE 121 associated with the participant to the IMS node 150.
Action 402. The IMS node 150 forwards the request to start the in-call service to the network node 140. This Action relates to Action 601 described below.
Action 403. Having received the request, the network node 140 starts the in-call service. As mentioned above, the in-call service may be comprised in the DA platform or in a Third Party node.
Action 404. In order to successfully execute the in-call service, the network node 140 sends a request to the IMS node 150 to create individual media channels. The media channels may e.g. be bi-directional audio channels. This Action relates to Actions 501 and 602 described below.
Action 405. As mentioned above, the network node 140 may expose an API to the IMS network by sending an API invocation to the IMS node 150. This Action relates to Actions 502 and 603 described below.
Action 406. Upon receiving the request to create individual media channels, the IMS node 150 determines the number of participants in the media session. The number of participants in the media session is important for the IMS node 150 to know since it affects the required number of individual media channels that are to be created. The required number of individual channels may be, in a standard scenario, the number of participants plus one channel for the DA. In the example described above referring to Fig. 3, the number of participants was two, and therefore, in a standard scenario of embodiments herein, the IMS node may create three channels, i.e. one for each participant and one for the DA. This Action relates to Action 503 described below.
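The channel-count rule of the standard scenario can be stated as a one-line helper; the function name is an illustrative assumption.

```python
# Standard scenario described above: one individual media channel per
# participant plus one channel for the DA.
def required_channel_count(n_participants, da_channels=1):
    return n_participants + da_channels
```

With the two participants of the Fig. 3 example, this gives three channels.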
Action 407. Once the number of participants, and hereby the required number of media channels, has been determined, the IMS node negotiates the media session.
Negotiating, which may also be expressed as renegotiating, comprises the IMS node 150 indicating as many media descriptors as participants in the media session. The negotiation may e.g. take place between two or more SIP endpoints. In the example scenario, the IMS node 150 may negotiate with the network node 140. This Action relates to Action 504 described below.
Action 408. Having determined the number of individual media channels required for the in-call service, and having negotiated the media session, the IMS node 150 creates the required individual media channels. A functionality such as the vMRF in the IMS node 150 may be tasked to perform this, as explained above concerning the example in Fig. 3. This Action relates to Action 505 described below.
Action 409. The IMS node 150 is responsible for performing the media session.
Action 410. The IMS node may then receive input from the participants in the media session.
Action 411. The IMS node 150 may also provide input to the network node 140.
Action 412. The network node 140 may then perform the in-call service.
Action 413. The output from the in-call service is provided from the network node 140 to the IMS node 150.
Action 414. The IMS node 150 may then forward the output to the participants in the media session.
The Actions 409-414 relate to Actions 506, 604 and 605 described below.
Action 415. When the in-call service is no longer required, any participant may initiate a termination of the in-call service. This may e.g. be done through the use of a voice command to the DA, such as by saying: "DA, stop the in-call service". The IMS node 150 receives this request from the participants.
Action 416. The IMS node then forwards the request to the network node 140.
Action 417. The network node 140 then terminates the in-call service. This Action relates to Action 606 described below.
Action 418. If the created individual media channels are no longer needed, the network node 140 sends a request to the IMS node 150 to remove the created individual media channels. This Action relates to Action 607 described below.
Action 419. Upon receiving the request, the IMS node 150 removes the created individual media channels. Removing the created channels may e.g. be done by means of the API exposed to the IMS node 150, for creating and removing media channels, as explained above. This Action relates to Action 507 described below.
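The overall lifecycle of Actions 401 to 419, i.e. starting the in-call service with one channel per participant plus one for the DA and tearing everything down on termination, can be sketched as a small session object. The class and method names are illustrative assumptions, not a disclosed interface.

```python
# Minimal sketch of the Action 401-419 lifecycle for one media session.
class InCallServiceSession:
    def __init__(self, participants):
        self.participants = list(participants)
        self.channels = []
        self.active = False

    def start(self):
        # Actions 403-408: start the service and create the individual
        # media channels, one per participant plus one for the DA.
        self.active = True
        self.channels = self.participants + ["DA"]

    def stop(self):
        # Actions 415-419: terminate the service and remove the channels.
        self.active = False
        self.channels = []
```

For the two-participant example, starting the session yields three channels, and stopping it removes all of them.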
Example embodiments of a method performed by the IMS node 150 in the communications network 100, for handling media between participants in a media session, will now be described with reference to a flowchart depicted in Figure 5. The IMS node 150 may be represented by a Virtual Multimedia Resource Function, vMRF, node. The method comprises the following actions, which actions may be taken in any suitable order. Actions that are optional are presented in dashed boxes in Figure 5.
Action 501. The IMS node 150 obtains, from the network node 140, a request to create individual media channels for the media session. This Action relates to Action 404 described above and Action 602 described below.
Action 502. The IMS node 150 may further receive, from the network node 140, an API invocation for creating and removing individual media channels. This Action relates to Action 405 described above and Action 603 described below.
Action 503. The IMS node 150 may determine the number of participants in the media session, and thereby create at least one individual media channel for each determined participant. This Action relates to Action 406 described above.
Action 504. The IMS node 150 may negotiate the media session by indicating a number of media descriptors corresponding to a number of participants in the media session. This Action relates to Action 407 described above.
Action 505. The IMS node 150 creates individual media channels for the media session, at least one individual media channel for each participant. This Action relates to Action 408 described above. The individual media channels may be bi-directional audio channels associated with one of the participants in the media session or the DA.
Action 506. The IMS node 150 performs the media session using the created individual media channels. This Action relates to Action 410 described above.
Action 507. The IMS node 150 may remove the created individual media channels. This Action relates to Action 419 described above. Removing the created individual media channels may be performed upon request from one of the participants in the media session, via the network node 140.
Example embodiments of a method performed by the network node 140 in the communications network 100, for handling media between participants in the media session, will now be described with reference to a flowchart depicted in Fig. 6. The method comprises the following actions, which actions may be taken in any suitable order. Actions that are optional are presented in dashed boxes in Fig. 6. The network node 140 may be represented by a distributed node comprising the first network node 141 and the second network node 142. The first network node 141 may be an operator controlled DA platform node. The second network node 142 may comprise an in-call service.
Action 601. The network node 140 obtains the request to start an in-call service, which may e.g. be an in-call translation service. E.g. the network node receives this request from the IMS node 150, upon the IMS node 150 having received the same request from a participant in the media session. This Action relates to Actions 401 and 402 described above.
Action 602. The network node 140 sends, to the IMS node 150, another request to create individual media channels for the media session, at least one individual media channel for each participant. The media session may comprise at least two participants and the input for the requested in-call service may be given as individual audio associated with one of the at least two participants. According to some embodiments, the individual media channels may further be bi-directional audio channels associated with at least one of the participants in the media session or an operator-controlled DA. This Action relates to Actions 404 and 501 described above.
Action 603. According to some embodiments, the network node 140 may further expose an API to the IMS node 150, for creating and removing individual media channels. This Action relates to Actions 405 and 502 described above.
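The exposed API of Action 603 can be sketched as a small facade with a create and a remove operation, matching the "other request" of Action 602 and the terminating request of Action 607. The class, method and payload names below are assumptions for illustration, not part of any standardized interface.

```python
# Hypothetical sketch of an API for creating and removing individual media
# channels (Action 603). All identifiers are illustrative.

class InCallServiceApi:
    def __init__(self):
        self._sessions = {}

    def create_channels(self, session_id, participants):
        """Create at least one individual media channel per participant (Action 602)."""
        self._sessions[session_id] = {
            p: f"channel-{session_id}-{p}" for p in participants
        }
        return self._sessions[session_id]

    def remove_channels(self, session_id):
        """Remove the created individual media channels of a session (Action 607)."""
        return self._sessions.pop(session_id, {})
```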
Action 604. According to some embodiments, the network node 140 may further perform the requested in-call service. This Action relates to Action 412 described above.
Action 605. According to some embodiments, the network node 140 may further provide the requested in-call service to the participants in the media session. This Action relates to Action 413 described above.
Action 606. According to some embodiments, the network node 140 may further terminate the requested in-call service, upon request from one of the participants in the media session. This Action relates to Action 417 described above.
Action 607. According to some embodiments, the network node 140 may further send a terminating request to the IMS node 150 to remove one or more of the created individual media channels. This Action relates to Action 418 described above.
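For Actions 604 and 605, an in-call translation service consumes the individual audio associated with each participant and delivers translated output to the other participants. The sketch below shows this fan-out; the `translate` callable stands in for a real speech-translation backend, and every name is an illustrative assumption.

```python
# Illustrative sketch of performing and providing an in-call translation
# service (Actions 604-605): per-participant input audio is translated and
# addressed to every other participant. Names are hypothetical.

def run_in_call_translation(per_participant_audio, translate):
    """Map each listener to the translated audio of the other participants."""
    output = {}
    for speaker, audio in per_participant_audio.items():
        for listener in per_participant_audio:
            if listener != speaker:
                output.setdefault(listener, []).append(translate(audio))
    return output
```

Because the input is given as individual audio per participant, the service can translate each speaker separately instead of working on a mixed stream, which is the point of the individual media channels.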
Embodiments herein such as those mentioned above will now be further described and exemplified. The text below is applicable to, and may be combined with, any suitable embodiment described above.
To perform the method actions above for handling media between participants in a media session, the IMS node 150 may comprise the arrangement depicted in Fig. 7. Fig. 7 is a block diagram depicting the IMS node 150 in two embodiments configured to operate in the communications network 100, wherein the communications network 100 comprises the network node 140. The IMS node 150 may be used for handling media between participants in a media session, e.g. by creating individual media channels for the participants in the media session. The IMS node 150 may be represented by a vMRF node. The IMS node 150 may comprise a processing circuitry 760, e.g. one or more processors, configured to perform the methods herein.
The IMS node 150 may comprise a communication interface 700 depicted in Fig. 7, configured to communicate e.g. with the network node 140. The communication interface 700 may comprise a transceiver, a receiver, a transmitter, and/or one or more antennas.
The IMS node 150 may comprise an obtaining unit 701, e.g. a receiver, transceiver or retrieving module. The IMS node 150, the processing circuitry 760, and/or the obtaining unit 701 may be configured to obtain from the network node 140, the request to create individual media channels for the media session.

The IMS node 150 may comprise a receiving unit 702, e.g. a receiver, transceiver or retrieving module. The IMS node 150, the processing circuitry 760, and/or the receiving unit 702 may be configured to receive, from the network node 140, the API invocation for creating and removing individual media channels.

The IMS node 150 may comprise a determining unit and/or module 703. The IMS node 150, the processing circuitry 760, and/or the determining unit 703 may be configured to determine the number of participants in the media session and create at least one individual media channel for each determined participant.

The IMS node 150 may comprise a negotiating unit and/or module 704. The IMS node 150, the processing circuitry 760, and/or the negotiating unit 704 may be configured to negotiate the media session by indicating the number of media descriptors corresponding to the number of participants in the media session.

The IMS node 150 may comprise a creating unit and/or module 705. The IMS node 150, the processing circuitry 760, and/or the creating unit 705 may be configured to create individual media channels for the media session, at least one individual media channel for each participant. The individual media channels created by the IMS node 150 may be bi-directional audio channels associated with one of the participants in the media session and/or the DA.

The IMS node 150 may comprise a performing unit and/or module 706. The IMS node 150, the processing circuitry 760, and/or the performing unit 706 may be configured to perform the media session using the created individual media channels.
The IMS node 150 may comprise a removing unit and/or module 707.
The IMS node 150, the processing circuitry 760, and/or the removing unit 707 may be configured to remove the created individual media channels. The IMS node 150, the processing circuitry 760, and/or the removing unit 707 may further be configured to remove the created individual media channels upon request from one of the participants in the media session, via the network node 140.
The IMS node 150 further comprises a memory 770. The memory comprises one or more units to be used to store data on, such as participant information, media descriptors, media channel setup data, applications to perform the methods disclosed herein when being executed, and similar.
The methods according to the embodiments described herein for the IMS node 150 are implemented by means of e.g. a computer program product 780 or a computer program, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the IMS node 150. The computer program 780 may be stored on a computer-readable storage medium 790, e.g. a disc or similar. The computer-readable storage medium 790, having stored thereon the computer program product, may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the IMS node 150. In some embodiments, the computer-readable storage medium may be a non-transitory computer-readable storage medium.
To perform the method actions above for handling media between participants in a media session, the network node 140 may comprise the arrangement depicted in Fig. 8.
Fig. 8 is a block diagram depicting the network node 140 in two embodiments configured to operate in the communications network 100, wherein the communications network 100 comprises the first UE 121 and the second UE 122. The network node 140 may be used for handling media between participants in a media session, e.g. sending requests to the IMS node 150 in the communications network 100 to create individual media channels for the media session. The network node 140 may be represented by a distributed node comprising the first network node 141 and the second network node 142. The first network node 141 may be represented by an operator controlled DA platform node and the second network node 142 may be configured to comprise the in-call service. The in-call service may comprise an in-call translation service.
The network node 140 may comprise a processing circuitry 860 e.g. one or more processors, configured to perform the methods herein.
The network node 140 may comprise a communication interface 800 depicted in Fig. 8, configured to communicate e.g. with the first UE 121 and the second UE 122. The communication interface 800 may comprise a transceiver, a receiver, a transmitter, and/or one or more antennas.
The network node 140 may comprise an obtaining unit 801, e.g. a receiver, transceiver or obtaining module. The network node 140, the processing circuitry 860, and/or the obtaining unit 801 may be configured to obtain the request to start an in-call service. The media session may comprise at least two participants, and the input for the requested in-call service may be given as individual audio associated with one of the at least two participants.
The network node 140 may comprise a sending unit 802, e.g. a transmitter, transceiver or sending module. The network node 140, the processing circuitry 860, and/or the sending unit 802 may be configured to send, to the IMS node 150, the other request to create individual media channels for the media session, at least one individual media channel for each participant. The network node 140 may be configured to request individual media channels that are bi-directional audio channels associated with at least one of the participants in the media session and/or an operator-controlled DA.
The network node 140 may comprise an exposing unit and/or module 803. The network node 140, the processing circuitry 860, and/or the exposing unit 803 may be configured to expose, to the IMS node 150, an API for creating and removing individual media channels.
The network node 140 may comprise a performing unit and/or module 804. The network node 140, the processing circuitry 860, and/or the performing unit 804 may be configured to perform the requested in-call service.
The network node 140 may comprise a providing unit 805, e.g. a transmitter, transceiver or providing module. The network node 140, the processing circuitry 860, and/or the providing unit 805 may be configured to provide the requested in-call service to the participants in the media session.
The network node 140 may comprise a terminating unit and/or module 806. The network node 140, the processing circuitry 860, and/or the terminating unit 806 may be configured to terminate the requested in-call service, upon request from one of the participants in the media session. The network node 140, the processing circuitry 860, and/or the terminating unit 806 may be configured to send to the IMS node 150, the terminating request to remove one or more of the created individual media channels.
The network node 140 further comprises a memory 870. The memory comprises one or more units to be used to store data on, such as media descriptors, requests, participant input, audio channels, DA features, applications to perform the methods disclosed herein when being executed, and similar.
The methods according to the embodiments described herein for the network node 140 are implemented by means of e.g. a computer program product 880 or a computer program, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the network node 140. The computer program 880 may be stored on a computer-readable storage medium 890, e.g. a disc or similar. The computer-readable storage medium 890, having stored thereon the computer program product, may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the network node 140. In some embodiments, the computer-readable storage medium may be a non-transitory computer-readable storage medium.
With reference to Figure 9, in accordance with an embodiment, a communication system includes a telecommunication network 3210 such as the wireless communications network 100, e.g. a NR network, such as a 3GPP-type cellular network, which comprises an access network 3211, such as a radio access network, and a core network 3214. The access network 3211 comprises a plurality of base stations 3212a, 3212b, 3212c, such as the radio network node 110, access nodes, AP STAs, NBs, eNBs, gNBs or other types of wireless access points, each defining a corresponding coverage area 3213a, 3213b, 3213c. Each base station 3212a, 3212b, 3212c is connectable to the core network 3214 over a wired or wireless connection 3215. A first user equipment (UE), e.g. the wireless devices 120, such as a Non-AP STA 3291 located in coverage area 3213c, is configured to wirelessly connect to, or be paged by, the corresponding base station 3212c. A second UE 3292, e.g. the first or second radio node 110, 120 or such as a Non-AP STA in coverage area 3213a, is wirelessly connectable to the corresponding base station 3212a. While a plurality of UEs 3291, 3292 are illustrated in this example, the disclosed embodiments are equally applicable to a situation where a sole UE is in the coverage area or where a sole UE is connecting to the corresponding base station 3212.
The telecommunication network 3210 is itself connected to a host computer 3230, which may be embodied in the hardware and/or software of a standalone server, a cloud-implemented server, a distributed server or as processing resources in a server farm. The host computer 3230 may be under the ownership or control of a service provider, or may be operated by the service provider or on behalf of the service provider. The connections 3221, 3222 between the telecommunication network 3210 and the host computer 3230 may extend directly from the core network 3214 to the host computer 3230 or may go via an optional intermediate network 3220. The intermediate network 3220 may be one of, or a combination of more than one of, a public, private or hosted network; the intermediate network 3220, if any, may be a backbone network or the Internet; in particular, the intermediate network 3220 may comprise two or more sub-networks (not shown).
The communication system of Figure 9 as a whole enables connectivity between one of the connected UEs 3291, 3292 and the host computer 3230. The connectivity may be described as an over-the-top (OTT) connection 3250. The host computer 3230 and the connected UEs 3291, 3292 are configured to communicate data and/or signaling via the OTT connection 3250, using the access network 3211, the core network 3214, any intermediate network 3220 and possible further infrastructure (not shown) as intermediaries. The OTT connection 3250 may be transparent in the sense that the participating communication devices through which the OTT connection 3250 passes are unaware of routing of uplink and downlink communications. For example, a base station 3212 may not or need not be informed about the past routing of an incoming downlink communication with data originating from a host computer 3230 to be forwarded (e.g., handed over) to a connected UE 3291. Similarly, the base station 3212 need not be aware of the future routing of an outgoing uplink communication originating from the UE 3291 towards the host computer 3230.
Example implementations, in accordance with an embodiment, of the UE, base station and host computer discussed in the preceding paragraphs will now be described with reference to Figure 10. In a communication system 3300, a host computer 3310 comprises hardware 3315 including a communication interface 3316 configured to set up and maintain a wired or wireless connection with an interface of a different communication device of the communication system 3300. The host computer 3310 further comprises processing circuitry 3318, which may have storage and/or processing capabilities. In particular, the processing circuitry 3318 may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions. The host computer 3310 further comprises software 3311, which is stored in or accessible by the host computer 3310 and executable by the processing circuitry 3318. The software 3311 includes a host application 3312. The host application 3312 may be operable to provide a service to a remote user, such as a UE 3330 connecting via an OTT connection 3350 terminating at the UE 3330 and the host computer 3310. In providing the service to the remote user, the host application 3312 may provide user data which is transmitted using the OTT connection 3350.
The communication system 3300 further includes a base station 3320 provided in a telecommunication system and comprising hardware 3325 enabling it to communicate with the host computer 3310 and with the UE 3330. The hardware 3325 may include a communication interface 3326 for setting up and maintaining a wired or wireless connection with an interface of a different communication device of the communication system 3300, as well as a radio interface 3327 for setting up and maintaining at least a wireless connection 3370 with a UE 3330 located in a coverage area (not shown in Figure 10) served by the base station 3320. The communication interface 3326 may be configured to facilitate a connection 3360 to the host computer 3310. The connection 3360 may be direct or it may pass through a core network (not shown in Figure 10) of the telecommunication system and/or through one or more intermediate networks outside the telecommunication system. In the embodiment shown, the hardware 3325 of the base station 3320 further includes processing circuitry 3328, which may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions. The base station 3320 further has software 3321 stored internally or accessible via an external connection.
The communication system 3300 further includes the UE 3330 already referred to. Its hardware 3335 may include a radio interface 3337 configured to set up and maintain a wireless connection 3370 with a base station serving a coverage area in which the UE 3330 is currently located. The hardware 3335 of the UE 3330 further includes processing circuitry 3338, which may comprise one or more programmable processors, application-specific integrated circuits, field programmable gate arrays or combinations of these (not shown) adapted to execute instructions. The UE 3330 further comprises software 3331, which is stored in or accessible by the UE 3330 and executable by the processing circuitry 3338. The software 3331 includes a client application 3332. The client application 3332 may be operable to provide a service to a human or non-human user via the UE 3330, with the support of the host computer 3310. In the host computer 3310, an executing host application 3312 may communicate with the executing client application 3332 via the OTT connection 3350 terminating at the UE 3330 and the host computer 3310. In providing the service to the user, the client application 3332 may receive request data from the host application 3312 and provide user data in response to the request data. The OTT connection 3350 may transfer both the request data and the user data. The client application 3332 may interact with the user to generate the user data that it provides.
It is noted that the host computer 3310, base station 3320 and UE 3330 illustrated in Figure 10 may be identical to the host computer 3230, one of the base stations 3212a, 3212b, 3212c and one of the UEs 3291, 3292 of Figure 9, respectively. This is to say, the inner workings of these entities may be as shown in Figure 10 and, independently, the surrounding network topology may be that of Figure 9.
In Figure 10, the OTT connection 3350 has been drawn abstractly to illustrate the communication between the host computer 3310 and the user equipment 3330 via the base station 3320, without explicit reference to any intermediary devices and the precise routing of messages via these devices. Network infrastructure may determine the routing, which it may be configured to hide from the UE 3330 or from the service provider operating the host computer 3310, or both. While the OTT connection 3350 is active, the network infrastructure may further take decisions by which it dynamically changes the routing (e.g., on the basis of load balancing considerations or reconfiguration of the network).
The wireless connection 3370 between the UE 3330 and the base station 3320 is in accordance with the teachings of the embodiments described throughout this disclosure. One or more of the various embodiments improve the performance of OTT services provided to the UE 3330 using the OTT connection 3350, in which the wireless connection 3370 forms the last segment. More precisely, the teachings of these embodiments may improve the in-call service with benefits such as improved quality of e.g. transcripts and other services.
A measurement procedure may be provided for the purpose of monitoring data rate, latency and other factors on which the one or more embodiments improve. There may further be an optional network functionality for reconfiguring the OTT connection 3350 between the host computer 3310 and UE 3330, in response to variations in the measurement results. The measurement procedure and/or the network functionality for reconfiguring the OTT connection 3350 may be implemented in the software 3311 of the host computer 3310 or in the software 3331 of the UE 3330, or both. In embodiments, sensors (not shown) may be deployed in or in association with communication devices through which the OTT connection 3350 passes; the sensors may participate in the measurement procedure by supplying values of the monitored quantities exemplified above, or supplying values of other physical quantities from which software 3311, 3331 may compute or estimate the monitored quantities. The reconfiguring of the OTT connection 3350 may include message format, retransmission settings, preferred routing etc.; the reconfiguring need not affect the base station 3320, and it may be unknown or imperceptible to the base station 3320. Such procedures and functionalities may be known and practiced in the art. In certain embodiments, measurements may involve proprietary UE signaling facilitating the host computer’s 3310 measurements of throughput, propagation times, latency and the like. The measurements may be implemented in that the software 3311, 3331 causes messages to be transmitted, in particular empty or ‘dummy’ messages, using the OTT connection 3350 while it monitors propagation times, errors etc.
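The ‘dummy’-message measurement described above can be sketched as a small probe that times empty messages over the OTT connection. The `send` callable stands in for the real transport, and the function name is an illustrative assumption.

```python
# Illustrative sketch of the measurement procedure: transmit empty 'dummy'
# messages and record round-trip propagation times. Names are hypothetical.
import time

def probe_latency(send, samples=3):
    """Return round-trip times, in seconds, for a few empty probe messages."""
    rtts = []
    for _ in range(samples):
        start = time.monotonic()
        send(b"")  # empty/'dummy' message over the OTT connection
        rtts.append(time.monotonic() - start)
    return rtts
```

Software 3311 or 3331 could feed such samples back to the reconfiguration functionality when the measured latency drifts.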
Figure 11 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA, which may be those described with reference to Figure 9 and Figure 10. For simplicity of the present disclosure, only drawing references to Figure 11 will be included in this section. In a first action 3410 of the method, the host computer provides user data. In an optional subaction 3411 of the first action 3410, the host computer provides the user data by executing a host application. In a second action 3420, the host computer initiates a transmission carrying the user data to the UE. In an optional third action 3430, the base station transmits to the UE the user data which was carried in the transmission that the host computer initiated, in accordance with the teachings of the embodiments described throughout this disclosure. In an optional fourth action 3440, the UE executes a client application associated with the host application executed by the host computer.
Figure 12 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 9 and Figure 10. For simplicity of the present disclosure, only drawing references to Figure 12 will be included in this section. In a first action 3510 of the method, the host computer provides user data. In an optional subaction (not shown) the host computer provides the user data by executing a host application. In a second action 3520, the host computer initiates a transmission carrying the user data to the UE. The transmission may pass via the base station, in accordance with the teachings of the embodiments described throughout this disclosure. In an optional third action 3530, the UE receives the user data carried in the transmission.
Figure 13 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 9 and Figure 10. For simplicity of the present disclosure, only drawing references to Figure 13 will be included in this section. In an optional first action 3610 of the method, the UE receives input data provided by the host computer. Additionally or alternatively, in an optional second action 3620, the UE provides user data. In an optional subaction 3621 of the second action 3620, the UE provides the user data by executing a client application. In a further optional subaction 3611 of the first action 3610, the UE executes a client application which provides the user data in reaction to the received input data provided by the host computer. In providing the user data, the executed client application may further consider user input received from the user. Regardless of the specific manner in which the user data was provided, the UE initiates, in an optional third subaction 3630, transmission of the user data to the host computer. In a fourth action 3640 of the method, the host computer receives the user data transmitted from the UE, in accordance with the teachings of the embodiments described throughout this disclosure.
Figure 14 is a flowchart illustrating a method implemented in a communication system, in accordance with one embodiment. The communication system includes a host computer, a base station such as an AP STA, and a UE such as a Non-AP STA which may be those described with reference to Figure 9 and Figure 10. For simplicity of the present disclosure, only drawing references to Figure 14 will be included in this section. In an optional first action 3710 of the method, in accordance with the teachings of the embodiments described throughout this disclosure, the base station receives user data from the UE. In an optional second action 3720, the base station initiates transmission of the received user data to the host computer. In a third action 3730, the host computer receives the user data carried in the transmission initiated by the base station.
When using the word “comprise” or “comprising” it shall be interpreted as non-limiting, i.e. meaning “consist at least of”. As will be readily understood by those familiar with communications design, functions, means, units, or modules may be implemented using digital logic and/or one or more microcontrollers, microprocessors, or other digital hardware. In some embodiments, several or all of the various functions may be implemented together, such as in a single application-specific integrated circuit (ASIC), or in two or more separate devices with appropriate hardware and/or software interfaces between them. Several of the functions may be implemented on a processor shared with other functional components of an intermediate network node, for example.
Alternatively, several of the functional elements of the processing circuitry discussed may be provided through the use of dedicated hardware, while others are provided with hardware for executing software, in association with the appropriate software or firmware. Thus, the term“processor” or“controller” as used herein does not exclusively refer to hardware capable of executing software and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random-access memory for storing software and/or program or application data, and non-volatile memory. Other hardware, conventional and/or custom, may also be included. Designers of radio network nodes will appreciate the cost, performance, and maintenance trade-offs inherent in these design choices.
In some embodiments a non-limiting term “UE” is used. The UE herein may be any type of UE capable of communicating with a network node or another UE over radio signals. The UE may also be a radio communication device, target device, device-to-device (D2D) UE, machine-type UE or UE capable of machine-to-machine communication (M2M), Internet of Things (IoT) operable device, a sensor equipped with UE, iPad, Tablet, mobile terminal, smart phone, laptop embedded equipment (LEE), laptop mounted equipment (LME), USB dongle, Customer Premises Equipment (CPE) etc.
Also in some embodiments the generic terminology “network node” is used. It may be any kind of network node, which may comprise a core network node (e.g., NOC node, Mobility Management Entity (MME), Operation and Maintenance (O&M) node, Self-Organizing Network (SON) node, a coordinating node, controlling node, Minimizing Drive Test (MDT) node, etc.), or an external node (e.g., 3rd party node, a node external to the current network), or even a radio network node such as base station, radio base station, base transceiver station, base station controller, network controller, evolved Node B (eNB), Node B, multi-RAT base station, Multi-cell/multicast Coordination Entity (MCE), relay node, access point, radio access point, Remote Radio Unit (RRU), Remote Radio Head (RRH), etc.
The term“radio node” used herein may be used to denote the wireless device or the radio network node.
The term“signaling” used herein may comprise any of: high-layer signaling, e.g., via Radio Resource Control (RRC), lower-layer signaling, e.g., via a physical control channel or a broadcast channel, or a combination thereof. The signaling may be implicit or explicit. The signaling may further be unicast, multicast or broadcast. The signaling may also be directly to another node or via a third node.
The embodiments described herein may apply to any RAT or their evolution, e.g., LTE Frequency Duplex Division (FDD), LTE Time Duplex Division (TDD), LTE with frame structure 3 or unlicensed operation, UTRA, GSM, WiFi, short-range communication RAT, narrow band RAT, RAT for 5G, etc.
It will be appreciated that the foregoing description and the accompanying drawings represent non-limiting examples of the methods and apparatus taught herein. As such, the apparatus and techniques taught herein are not limited by the foregoing description and accompanying drawings. Instead, the embodiments herein are limited only by the following claims and their legal equivalents.

Claims

1. A method performed by an Internet Protocol Multimedia Subsystem, IMS, node (150) in a communications network (100), for handling media between participants in a media session, the method comprising:
- obtaining (501) from a network node (140), a request to create individual media channels for the media session;
- creating (505) individual media channels for the media session, at least one individual media channel for each participant; and
- performing (506) the media session using the created individual media channels.
2. The method according to claim 1, further comprising:
- removing (507) the created individual media channels.
3. The method according to claim 2, wherein removing (507) the created individual media channels is performed upon request from one of the participants in the media session, via the network node (140).
4. The method according to any of the claims 1-3, further comprising:
- determining (503) a number of participants in the media session; and wherein creating (505) further comprises creating at least one individual media channel for each determined participant.
5. The method according to any of the claims 1-4, further comprising:
- receiving (502) from the network node (140) an Application Program Interface, API, invocation for creating and removing individual media channels.
6. The method according to any of the claims 1-5, further comprising:
- negotiating (504) the media session by indicating a number of media descriptors corresponding to a number of participants in the media session.
7. The method according to any of the claims 1-6, wherein the individual media channels are bi-directional audio channels associated with one of the participants in the media session and/or a Digital Assistant, DA.
8. The method according to any of the claims 1-7, wherein the IMS node (150) is a Virtual Multimedia Resource Function, vMRF, node.
9. A computer program comprising instructions which, when executed by a processor, cause the processor to perform actions according to any of the claims 1 to 8.
10. A carrier comprising the computer program of claim 9, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.
11. A method performed by a network node (140) in a communications network (100), for handling media between participants in a media session, the method comprising:
- obtaining (601) a request to start an in-call service; and
- sending (602), to an Internet Protocol Multimedia Subsystem, IMS, node (150), a request to create individual media channels for the media session, at least one individual media channel for each participant.
12. The method according to claim 11, further comprising:
- performing (604) the requested in-call service; and
- providing (605) the requested in-call service to the participants in the media session.
13. The method according to any of the claims 11-12, further comprising:
- terminating (606) the requested in-call service, upon request from one of the participants in the media session.
14. The method according to any of the claims 11-13, further comprising:
- sending (607), to the IMS node (150), a request to remove one or more of the created individual media channels.
15. The method according to any of the claims 11-14, further comprising:
- exposing (603) to the IMS node (150), an Application Program Interface, API, for creating and removing individual media channels.
16. The method according to any of the claims 11-15, wherein the media session comprises at least two participants and the input for the requested in-call service is given as individual audio associated with one of the at least two participants.
17. The method according to any of the claims 11-16, wherein the individual media channels are bi-directional audio channels associated with at least one of the participants in the media session and/or an operator-controlled Digital Assistant, DA.
18. The method according to any of the claims 11-17, wherein the in-call service is an in-call translation service.
19. The method according to any of the claims 11-18, wherein the network node (140) is a distributed node comprising a first network node (141) and a second network node (142).
20. The method according to claim 19, wherein the first network node (141) is an operator-controlled DA platform node.
21. The method according to any of the claims 19-20, wherein the second network node (142) comprises the in-call service.
22. A computer program comprising instructions which, when executed by a processor, cause the processor to perform actions according to any of the claims 11 to 21.
23. A carrier comprising the computer program of claim 22, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.
24. An Internet Protocol Multimedia Subsystem, IMS, node (150) in a communications network (100), configured to handle media between participants in a media session, wherein the IMS node (150) is further configured to:
obtain, from a network node (140), a request to create individual media channels for the media session;
- create individual media channels for the media session, at least one individual media channel for each participant; and
perform the media session using the created individual media channels.
25. The IMS node (150) according to claim 24, wherein the IMS node (150) is further configured to:
remove the created individual media channels.
26. The IMS node (150) according to claim 25, wherein the IMS node (150) is further configured to remove the created individual media channels upon request from one of the participants in the media session, via the network node (140).
27. The IMS node (150) according to any of the claims 24-26, wherein the IMS node (150) is further configured to:
determine a number of participants in the media session and create at least one individual media channel for each determined participant.
28. The IMS node (150) according to any of the claims 24-27, wherein the IMS node (150) is further configured to:
receive, from the network node (140), an Application Program Interface, API, invocation for creating and removing individual media channels.
29. The IMS node (150) according to any of the claims 24-28, wherein the IMS node (150) is further configured to:
negotiate the media session by indicating a number of media descriptors corresponding to a number of participants in the media session.
30. The IMS node (150) according to any of the claims 24-29, wherein the individual media channels are adapted to be bi-directional audio channels associated with one of the participants in the media session and/or a Digital Assistant, DA.
31. The IMS node (150) according to any of the claims 24-30, wherein the IMS node (150) is represented by a Virtual Multimedia Resource Function, vMRF, node.
32. A network node (140) in a communications network (100), configured to handle media between participants in a media session, wherein the network node (140) is further configured to:
- obtain a request to start an in-call service; and
send, to an Internet Protocol Multimedia Subsystem, IMS, node (150), a request to create individual media channels for the media session, at least one individual media channel for each participant.
33. The network node (140) according to claim 32, wherein the network node (140) is further configured to:
perform the requested in-call service; and
provide the requested in-call service to the participants in the media session.
34. The network node (140) according to any of the claims 32-33, wherein the network node (140) is further configured to:
terminate the requested in-call service, upon request from one of the participants in the media session.
35. The network node (140) according to any of the claims 32-34, wherein the network node (140) is further configured to:
send, to the IMS node (150), a request to remove one or more of the created individual media channels.
36. The network node (140) according to any of the claims 32-35, wherein the network node (140) is further configured to:
expose, to the IMS node (150), an Application Program Interface, API, for creating and removing individual media channels.
37. The network node (140) according to any of the claims 32-36, wherein the media session comprises at least two participants, and the input for the requested in-call service is given as individual audio associated with one of the at least two participants.
38. The network node (140) according to any of the claims 32-37, wherein the network node (140) is configured to request individual media channels that are bi-directional audio channels associated with at least one of the participants in the media session and/or an operator-controlled Digital Assistant, DA.
39. The network node (140) according to any of the claims 32-38, wherein the in-call service comprises an in-call translation service.
40. The network node (140) according to any of the claims 32-39, wherein the network node (140) is represented by a distributed node comprising a first network node (141) and a second network node (142).
41. The network node (140) according to claim 40, wherein the first network node (141) is represented by an operator-controlled DA platform node.
42. The network node (140) according to any of the claims 40-41 , wherein the second network node (142) is configured to comprise the in-call service.
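Claims 6 and 29 recite negotiating the media session by indicating one media descriptor per participant. As a purely illustrative sketch outside the claims, the following Python builds a minimal SDP-style offer with one bi-directional audio descriptor (m= line) per participant; all names, ports, and addresses are assumptions, not part of the application:

```python
def build_sdp_offer(participants, base_port=49170):
    """Build a minimal SDP body with one bi-directional audio media
    descriptor (m= line) per participant. Ports, payload type, and the
    a=label values are illustrative assumptions only."""
    lines = [
        "v=0",
        "o=- 0 0 IN IP4 192.0.2.1",  # 192.0.2.1 is a documentation address
        "s=-",
        "c=IN IP4 192.0.2.1",
        "t=0 0",
    ]
    for i, name in enumerate(participants):
        lines.append(f"m=audio {base_port + 2 * i} RTP/AVP 0")  # PCMU
        lines.append(f"a=label:{name}")  # tie the channel to a participant
        lines.append("a=sendrecv")       # bi-directional, cf. claims 7/30
    return "\r\n".join(lines) + "\r\n"

# Two call participants plus an operator-controlled digital assistant
offer = build_sdp_offer(["alice", "bob", "assistant"])
print(offer)
```

The number of m= lines in the offer equals the number of participants, which is the property the negotiation step in the claims relies on.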
PCT/SE2019/050708 2019-07-23 2019-07-23 Ims node, network node and methods in a communications network WO2021015651A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SE2019/050708 WO2021015651A1 (en) 2019-07-23 2019-07-23 Ims node, network node and methods in a communications network


Publications (1)

Publication Number Publication Date
WO2021015651A1 true WO2021015651A1 (en) 2021-01-28

Family

ID=67513720

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2019/050708 WO2021015651A1 (en) 2019-07-23 2019-07-23 Ims node, network node and methods in a communications network

Country Status (1)

Country Link
WO (1) WO2021015651A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110134807A1 (en) * 2009-12-08 2011-06-09 Alcatel-Lucent Usa Inc. Method for joining SIP communication devices into an existing call
US20150120825A1 (en) * 2013-10-25 2015-04-30 Avaya, Inc. Sequential segregated synchronized transcription and textual interaction spatial orientation with talk-over
US20180115589A1 (en) * 2016-10-21 2018-04-26 Fujitsu Limited Audio data transfer device and audio data transfer method


Similar Documents

Publication Publication Date Title
CN112087548B (en) Method for playing multimedia color vibration and color ring and application server
US8879544B1 (en) Outbound communication session establishment on a telecommunications network
US20140044125A1 (en) Outbound Communication Session Establishment on a Telecommunications Network
US20230007706A1 (en) First Network Node, Second Wireless Device and Methods Performed Therein
WO2015062454A1 (en) Audio and video call method, device and system
US20130157674A1 (en) Bandwidth extension usage optimization
US11509696B2 (en) Methods and apparatuses for enhancement to IP multimedia subsystem
US11689583B2 (en) Network node, entity and methods performed therein for handling a communication session in a communication network
CN107689945B (en) Media conversion equipment control method and device and media gateway
US11070665B2 (en) Voice over internet protocol processing method and related network device
EP3942714B1 (en) Network nodes and methods performed therein for handling media channels during an ongoing media session
WO2020256604A1 (en) Network node, ims node and methods in a communications network
WO2021015651A1 (en) Ims node, network node and methods in a communications network
US20220277150A1 (en) User Equipment, Network Node and Methods in a Communications Network
US20150195778A1 (en) Wireless communication system providing optimal network performance
US11765210B2 (en) Network node, IMS node and methods in a communications network
KR20200026166A (en) Method and system for providing calling supplementary service
US11924253B2 (en) Network node, IMS node and methods in a communications network
CN113765910B (en) Communication method, device, storage medium and electronic equipment
US11750667B2 (en) Network node, IP multimedia subsystem (IMS) node, over the top (OTT) digital assistant, and methods in a communications network
WO2019006752A1 (en) Gateway function control via telephony/voice service
US20190166233A1 (en) Method for Negotiating Maximum Packaging Interval, Apparatus, and Storage Medium
WO2016029820A1 (en) Service processing method, system and device
US20160165062A1 (en) Data/circuit channel handoff for ip telephony network
JP2014116677A (en) Media communication apparatus and media communication system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 19748609
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 19748609
Country of ref document: EP
Kind code of ref document: A1