CN113301357A - Live broadcast method and device and electronic equipment - Google Patents


Info

Publication number
CN113301357A
Authority
CN
China
Prior art keywords: stream, address, live broadcast, service, live
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010733464.9A
Other languages
Chinese (zh)
Other versions
CN113301357B (en)
Inventor
赵文倩
黄非
刘彦伊
许勇
刘福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Singapore Holdings Pte Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN202010733464.9A, patent CN113301357B
Priority to PCT/CN2021/107766, patent WO2022022370A1
Publication of CN113301357A
Application granted
Publication of CN113301357B
Legal status: Active
Anticipated expiration

Classifications

    • H04N 21/2187: Live feed (source of audio or video content; selective content distribution, e.g. interactive television or VOD)
    • G06F 40/55: Rule-based translation (handling natural language data)
    • G06F 40/56: Natural language generation
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/26: Speech to text systems
    • H04N 21/435: Processing of additional data, e.g. decrypting of additional data
    • H04N 21/47815: Electronic shopping
    • H04N 21/4788: Supplemental services communicating with other users, e.g. chatting
    • H04N 21/4856: End-user interface for client configuration for language selection, e.g. for the menu or subtitles
    • H04N 21/4884: Data services for displaying subtitles

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiments of the present application disclose a live broadcast method and apparatus, and an electronic device. The method comprises the following steps: a first service end receives a request for creating a multilingual live broadcast submitted by a first client; after the multilingual live broadcast is successfully created, a translated target live stream corresponding to at least one target language is obtained according to the source live stream captured by the first client; after a request for pulling the live stream is received from a second client, the target language required by the user associated with the second client is determined, and the target live stream corresponding to that target language is provided to the second client for playing. Through the embodiments of the present application, live broadcast technology can be better applied to systems such as cross-border commodity object information services.

Description

Live broadcast method and device and electronic equipment
Technical Field
The present application relates to the field of live broadcast technologies, and in particular, to a live broadcast method, an apparatus, and an electronic device.
Background
With the development of live broadcast technology, live broadcast has been introduced into more and more industries, including commodity object information service systems. A merchant or seller user introduces commodity object information by way of a live broadcast, and a buyer or consumer user can obtain more intuitive information about the commodity object through the live video and the anchor's spoken description, enjoying a shopping experience closer to real life. In addition, viewers can interact with the anchor during the broadcast, for example by asking about a commodity object, which the anchor can answer online in real time. In short, by introducing live broadcast technology, buyer or consumer users can be helped more effectively to make shopping decisions.
Some commodity object information service systems also provide cross-border services, for example selling commodity objects to overseas buyer or consumer users. When a commodity object is described in the traditional image-and-text form, those details can be translated into multiple languages for overseas users to browse. However, introducing live broadcast technology into such a cross-border commodity object information service system presents a certain difficulty: an anchor user can usually cover only one language during a broadcast, while the buyer users targeted in multiple countries face language barriers.
Therefore, how to better apply live broadcast technology to cross-border commodity object information service systems and the like is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The present application provides a live broadcast method and apparatus, and an electronic device, so that live broadcast technology can be better applied to systems such as cross-border commodity object information services.
The application provides the following scheme:
a live method, comprising:
a first service end receives a request for creating multi-language live broadcast submitted by a first client end;
after the multi-language live broadcast is successfully established, obtaining a translated target live broadcast stream corresponding to at least one target language according to a source live broadcast stream acquired by the first client;
after receiving a request for pulling the live stream submitted by a second client, determining a target language required by a user associated with the second client, and providing the target live stream corresponding to the target language for the second client to play.
A live stream processing method includes:
a second service end creates at least one director service according to a request submitted by a first service end, the request being submitted after the first service end receives a request for creating a multilingual live broadcast, and the at least one director service corresponding to at least one target language;
acquiring a first address and at least one second address provided by the first service end, wherein the first address is used for storing a source live stream of the live broadcast, and the at least one second address corresponds to at least one target language;
and after the multilingual live broadcast is successfully created, starting the director service, wherein the director service is used for reading the source live stream from the first address, performing streaming speech recognition on the source live stream by calling a streaming speech recognition service and a translation service to obtain a translation result corresponding to a target language, merging the source live stream and the translation result to generate a translated target live stream corresponding to that target language, and storing the translated target live stream to the second address corresponding to that target language.
A live stream processing method includes:
the third server side creates a streaming voice recognition service and a translation service according to a call request of the second server side, wherein the request carries target language information, a first address and a third address, and the first address is used for storing a source live broadcast stream;
reading the source live stream from the first address, and performing voice recognition on the source live stream through the streaming voice recognition service;
and translating the voice recognition result through a translation service to obtain a translation result corresponding to the target language, and storing the translation result in the third address, so that the second server acquires the translation result from the third address and synthesizes the translation result and the source live stream into a target live stream corresponding to the target language.
A live broadcast method, comprising:
a first client receives a request for creating multi-language live broadcast;
submitting the request to a first service end, and receiving a first address returned by the first service end;
and after the live broadcast is successfully created, pushing the generated live stream to the first address, so that the source live stream can be obtained from the first address and a translated target live stream corresponding to at least one target language can be obtained, for provision to a second client associated with a user with a target language requirement.
A method of acquiring a live stream, comprising:
the second client side submits a request for acquiring the live stream to the first service side;
receiving a second address provided by the first service end, wherein the second address is determined according to a target language required by a user associated with the second client end, and the second address stores a translated target live stream corresponding to the target language;
and pulling the target live stream through the second address and playing the target live stream.
A live broadcast device is applied to a first service end and comprises:
the request receiving unit is used for receiving a request for creating the multi-language live broadcast submitted by a first client;
a target live broadcast stream obtaining unit, configured to obtain, according to a source live broadcast stream acquired by the first client, a translated target live broadcast stream corresponding to at least one target language after the multilingual live broadcast is successfully created;
and the target live stream providing unit is used for determining a target language required by a user associated with the second client after receiving a request for pulling the live stream submitted by the second client, and providing the target live stream corresponding to the target language for the second client to play.
A live stream processing device is applied to a second server and comprises:
the system comprises a program guide station service establishing unit, a program guide station service establishing unit and a program guide station service establishing unit, wherein the program guide station service establishing unit is used for establishing at least one program guide station service according to a request submitted by a first service end; the request is submitted after the first service terminal receives a request for creating the multi-language live broadcast; the at least one director service corresponds to at least one target language;
an address obtaining unit, configured to obtain a first address and at least one second address provided by the first service end, where the first address is used to store a source live stream of the live broadcast, and the at least one second address corresponds to at least one target language;
and a director service starting unit, used for starting the director service after the multilingual live broadcast is successfully created, wherein the director service is used for reading the source live stream from the first address, performing streaming speech recognition on the source live stream by calling the streaming speech recognition service and translation service, obtaining a translation result corresponding to one target language, merging the source live stream and the translation result to generate a translated target live stream corresponding to that target language, and storing the translated target live stream to the second address corresponding to that target language.
A live stream processing device is applied to a third server and comprises:
the service creating unit is used for creating a streaming voice recognition service and a translation service according to a call request of a second service end, wherein the request carries target language information, a first address and a third address, and the first address is used for storing a source live stream;
the voice recognition unit is used for reading the source live stream from the first address and carrying out voice recognition on the source live stream through the streaming voice recognition service;
and the translation unit is used for translating the voice recognition result through translation service to obtain a translation result corresponding to the target language, and storing the translation result in the third address, so that the second server acquires the translation result from the third address and synthesizes the translation result and the source live stream into a target live stream corresponding to the target language.
A live broadcast device is applied to a first client and comprises:
the request receiving unit is used for receiving a request for creating the multi-language live broadcast;
the request submitting unit is used for submitting the request to a first service end and receiving a first address returned by the first service end;
and a stream pushing unit, used for submitting the generated live stream to the first address after the live broadcast is successfully created, so that the source live stream can be obtained from the first address and a translated target live stream corresponding to at least one target language can be obtained, for provision to a second client associated with a user with a target language requirement.
A device for acquiring a live stream, applied to a second client, comprises:
the request submitting unit is used for submitting a request for acquiring the live stream to the first service terminal;
an address obtaining unit, configured to receive a second address provided by the first service end, where the second address is determined according to a target language required by a user associated with the second client, and the second address stores a translated target live stream corresponding to the target language;
and the stream pulling unit is used for pulling the target live stream through the second address and playing the target live stream.
According to the specific embodiments provided herein, the present application discloses the following technical effects:
according to the embodiment of the application, the establishment of multi-language live broadcast can be supported, the translated target live broadcast stream corresponding to at least one target language can be generated according to the source live broadcast stream, after the second client initiates a request for acquiring the live broadcast stream, the target user required by the user associated with the second client can be determined, the corresponding target live broadcast stream is provided for the second client, and the user can view the live broadcast content meeting the language requirement of the user.
In a specific implementation, the multilingual live broadcast service can be provided within a commodity object information service system; in that case, training samples can be drawn from historical live broadcast records in the system to train the translation model. In addition, translations of specialized vocabulary in the commodity object information service field can be recorded in advance, improving the accuracy of the translation results.
Furthermore, when the multilingual live broadcast service is provided within a commodity object information service system, the country/region to which a user belongs can be judged from data the user associated with the second client has generated in the system, such as a commonly used shipping address, so that the target language the user requires is determined automatically.
Of course, it is not necessary for any product to achieve all of the above-described advantages at the same time for the practice of the present application.
Drawings
To more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application;
FIG. 2 is a flow chart of a first method provided by an embodiment of the present application;
FIG. 3-1 is a schematic diagram of an interaction timing sequence of a process of creating a live broadcast provided by an embodiment of the present application;
FIG. 3-2 is a schematic diagram of the interaction sequence of a stream-pushing process provided by an embodiment of the present application;
FIG. 3-3 is a schematic diagram of a viewer user interface provided by an embodiment of the present application;
FIG. 4 is a flow chart of a second method provided by embodiments of the present application;
FIG. 5 is a flow chart of a third method provided by embodiments of the present application;
FIG. 6 is a flow chart of a fourth method provided by embodiments of the present application;
FIG. 7 is a flow chart of a fifth method provided by embodiments of the present application;
FIG. 8 is a schematic diagram of a first apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of a second apparatus provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a third apparatus provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of a fourth apparatus provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a fifth apparatus provided by an embodiment of the present application;
fig. 13 is a schematic diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments that can be derived from the embodiments given herein by a person of ordinary skill in the art are intended to be within the scope of the present disclosure.
In the embodiment of the application, a cross-language live broadcast function is provided in order to apply a live broadcast technology in a cross-border commodity object information service system. When creating a live broadcast, a main broadcast user (which may be referred to as a first user in this embodiment of the present application, and correspondingly, a viewer user may be referred to as a second user) may select whether to use a cross-language live broadcast service, and if so, a server may help the user generate a target live broadcast stream corresponding to multiple target languages, and provide multiple pull stream addresses, where each pull stream address may correspond to one target language. Therefore, when the second user needs to watch the live broadcast, the server can provide the corresponding pull stream address to the client of the second user according to the target language required by the second user, so that the client can acquire the live stream of the corresponding target language from the pull stream address to play the live stream. In this manner, a live broadcast created by a first user in one language may be translated into a plurality of different target languages for viewing by a second user in a plurality of countries/regions. In this way, in the cross-border commodity object information service system, the second user can also obtain richer and more intuitive information about the commodity object by watching the live broadcast. Of course, the multi-language live broadcast method can also be used in other cross-border systems.
In the process of producing the translated target live stream for a target language, streaming speech recognition and translation are performed on the source live stream, and the accuracy of recognition and translation needs to be as high as possible. To this end, in the embodiments of the present application, a plurality of director services may be created through a dedicated server (referred to as the second service end; the server that interacts directly with the front-end live broadcast clients is referred to as the first service end), with each director service corresponding to one target language. Within each director service, a streaming speech recognition service and a translation service can be called to obtain a translation result data stream, and the source live stream and that data stream are then merged into the translated live stream for the specific target language. This live stream can be stored to a pull-stream address designated by the first service end, so that the first service end obtains translated target live streams for a plurality of different target languages.
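As a rough illustration only, the director-service pipeline just described (read the source stream, call recognition and translation, merge the result back in) can be sketched as follows. The recognition and translation functions are stand-in stubs, and all names are hypothetical rather than taken from the patent:

```python
def recognize(segment):
    # Stand-in for the streaming speech-recognition service.
    return segment["speech_text"]

def translate(text, target_language):
    # Stand-in for the translation service.
    return f"[{target_language}] {text}"

def director_service(source_segments, target_language):
    """Yield segments of the target live stream: the original video
    merged with a translated subtitle for the chosen target language."""
    for seg in source_segments:
        text = recognize(seg)
        subtitle = translate(text, target_language)
        yield {"video": seg["video"], "subtitle": subtitle}

source = [{"video": "frame-0", "speech_text": "大家好"}]
merged = list(director_service(source, "en"))
```

In a real deployment the stubs would be replaced by calls to the actual streaming speech recognition and translation services, and the merged segments would be written to the designated pull-stream address.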
In a preferred embodiment, the streaming speech recognition service and the translation service may themselves be provided by a third service end. Each service end can then concentrate on a particular function, and the cooperation of multiple service ends ultimately improves translation accuracy.
The translation service may translate the speech recognition results through a pre-built translation model. In a specific implementation, the embodiments of the present application mainly provide the multilingual live broadcast service within systems such as commodity object information services, so the live broadcast scenario is narrow, which lays a foundation for good translation accuracy. Specifically, historical live broadcast records in the commodity object information service system can be used as training data, making the translation model a specialized model for the commodity object information service field. In addition, proper nouns specific to the commodity object information service scenario can be recorded in advance, for example by obtaining their expressions in each target language beforehand. The specialized translation model and the pre-recorded proper-noun information together further improve translation accuracy.
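A minimal sketch of the pre-recorded proper-noun idea: domain terms are substituted with their recorded target-language forms before the general model runs. The glossary entries, function names, and the identity-function "model" below are all invented for illustration:

```python
# Hypothetical glossary of pre-recorded domain terms per target language.
GLOSSARY = {
    "en": {"秒杀": "flash sale", "包邮": "free shipping"},
}

def translate_with_glossary(text, target_language, model_translate):
    """Substitute pre-recorded terms first, so the general translation
    model never mangles domain vocabulary, then run the model."""
    terms = GLOSSARY.get(target_language, {})
    for src, dst in terms.items():
        text = text.replace(src, dst)
    return model_translate(text, target_language)

# Identity function standing in for the real translation model.
result = translate_with_glossary("今晚八点秒杀", "en", lambda t, lang: t)
```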
Furthermore, since the embodiments of the present application mainly provide the multilingual live broadcast service within systems such as commodity object information services, the target language required by a second user can be identified automatically from data the user has generated in the system (for example, a commonly used shipping address), so that the pull-stream address corresponding to that language can be recommended or pushed directly to the user.
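The automatic language determination could look roughly like the following; the country-to-language table and the field names are illustrative assumptions, not details from the patent:

```python
# Hypothetical mapping from shipping-address country to target language.
COUNTRY_LANGUAGE = {"US": "en", "FR": "fr", "DE": "de", "JP": "ja"}

def infer_target_language(user_profile, default="en"):
    """Guess the viewer's target language from profile data such as a
    commonly used shipping address; fall back to a default otherwise."""
    country = user_profile.get("shipping_country")
    return COUNTRY_LANGUAGE.get(country, default)

lang = infer_target_language({"shipping_country": "FR"})
```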
In a specific implementation, as shown in fig. 1, the embodiments of the present application may involve a client and a server provided by a system such as a commodity object information service, where the server corresponds to the first service end, and the clients are divided into a first client facing the anchor user and a second client facing the viewer users. As mentioned above, a second service end, and even a third service end, may also be involved. In one implementation, after the first client initiates a request to create a multilingual live broadcast to the first service end, the first service end may call an interface of the second service end to create a plurality of director services corresponding to a plurality of different target languages, and may at the same time generate a first address and a plurality of second addresses. After the live broadcast is successfully created, each director service may call the streaming speech recognition and translation service of the third service end, and the resulting translation may be stored to a third address specified by the director service. The director service reads the source live stream from the first address and the translation result data stream from the third address, merges them into the target live stream for its target language, and stores that stream to the second address specified by the first service end. Then, after a second client submits a request to obtain the live stream, the first service end can provide it with the second address corresponding to the target language required by its user, so that the second client can pull the target live stream in that language from the second address and play it.
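The address wiring described above can be sketched end to end with an in-memory dict standing in for the CDN addresses; the address strings, field names, and translation table are all made up for illustration:

```python
# In-memory stand-in for CDN storage keyed by address.
storage = {}

def push(address, data):
    storage.setdefault(address, []).append(data)

def pull(address):
    return storage.get(address, [])

# First client pushes the source live stream to the first address.
push("first://room-1", {"video": "v0", "speech": "你好"})

# Third service end: recognition plus translation, result stored at the
# third address (the table stands in for the translation service).
TRANSLATIONS = {"你好": "hello"}
for seg in pull("first://room-1"):
    push("third://room-1/en", {"subtitle": TRANSLATIONS[seg["speech"]]})

# Director service: merge source stream and translation result data
# stream into the per-language second address.
for seg, tr in zip(pull("first://room-1"), pull("third://room-1/en")):
    push("second://room-1/en", {"video": seg["video"], "subtitle": tr["subtitle"]})

# Second client pulls from the second address for its target language.
target_stream = pull("second://room-1/en")
```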
The users associated with different second clients may require different target languages, so the second address provided to different second clients may also differ. For example, suppose the anchor is a user in China and the source language of the source live stream is Chinese; after translation, target live streams corresponding to multiple target languages such as English, French, German and Japanese are obtained and stored at different second addresses. Then, when user A in an English-speaking country requests to watch the live broadcast, second address A, which stores the English target live stream, can be provided to user A; when user B in a French-speaking country requests to watch, second address B, which stores the French target live stream, can be provided to user B; and so on.
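The user A / user B example above amounts to a per-language lookup of second addresses, roughly as follows (the address values are made-up placeholders):

```python
# One second (pull) address per translated target live stream.
SECOND_ADDRESSES = {
    "en": "https://cdn.example.com/live/123/en.flv",
    "fr": "https://cdn.example.com/live/123/fr.flv",
}

def second_address_for(required_language):
    """Return the pull address holding the target live stream that
    matches the viewer's required language."""
    return SECOND_ADDRESSES[required_language]

addr_a = second_address_for("en")  # user A, English-speaking country
addr_b = second_address_for("fr")  # user B, French-speaking country
```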
The following describes in detail specific implementations provided in embodiments of the present application.
Example one
First, in the first embodiment, from the perspective of the first service end, a live broadcasting method is provided, and referring to fig. 2, the method may specifically include:
s201: a first service end receives a request for creating multi-language live broadcast submitted by a first client end;
in specific implementation, an operation option for creating live broadcast can be provided in a first client associated with a first user such as a main broadcast, when the first user clicks to create live broadcast, the first user can inquire whether the user needs to create multi-language live broadcast, and if the user selects the need, a request for creating multi-language live broadcast can be sent to the first server. Or, in another mode, different operation options for creating a normal live broadcast and a multi-language live broadcast respectively may be provided in the first client, and a user needing to create the multi-language live broadcast may directly initiate a specific request through the operation options.
In a specific implementation, when a user requests to create a multi-language live broadcast, the first client can provide an operation option for submitting the source language used in the live broadcast. For example, if a user in China speaks Chinese while live, the source language may be selected as "Chinese", and so on. In addition, in an alternative manner, the first client may also provide operation options for selecting the target language, that is, the first user decides which target languages to translate into. If the user does not select one, the target language may be determined by a default configuration. The default configuration may be shared by multiple users, or may be set according to personalized information of the user associated with the first client, for example according to the user's historical selection records. There may be one or more target languages; that is, the source live stream may be translated into target live streams corresponding to multiple different target languages, so that users in different countries/regions can understand the live content.
S202: after the multi-language live broadcast is successfully established, obtaining a translated target live broadcast stream corresponding to at least one target language according to a source live broadcast stream acquired by the first client;
after receiving the request for creating the multi-language live broadcast, the first server can enter a process of creating the live broadcast. After the creation is completed, the translated target live stream corresponding to at least one target language can be obtained according to the source live stream, so as to provide the translated target live stream for second clients with various language requirements.
In a specific implementation, as described above, the processing such as voice recognition and translation of the source live stream may be implemented by invoking the director service of the second server. The second service end may be a service end of a cloud service platform having an association relationship with the first service end, or may exist in other forms. In this case, after receiving the request for creating the multilingual live broadcast, the first service end may first generate a first address and at least one second address, where the at least one second address corresponds to at least one target language. Specifically, the first address and the second address may be addresses applied in an associated Content Delivery Network (CDN). After the first address and the second address are generated, the first address may be provided to the first client, so that, after the multi-language live broadcast is successfully created, the first client may store the generated source live broadcast stream to the first address (that is, the first client may push a stream to the first address). In addition, the first address and at least one second address may also be provided to a second server, so that the second server may obtain the source live stream from the first address, and after obtaining a translated target live stream corresponding to at least one target language, store the target live stream to the second address, respectively. In this way, the first server may obtain the translated target live stream corresponding to the plurality of different target languages respectively stored to the different second addresses. 
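As a minimal illustrative sketch of the address-generation step above (the domain name, URL scheme, and function name are hypothetical, not from the patent), the first server could generate one push (first) address for the anchor and one pull (second) address per target language:

```python
import uuid

# Hypothetical CDN domain; the patent only says addresses are applied for
# in an associated content delivery network.
CDN_DOMAIN = "live.example.com"

def create_multilingual_live(live_id, target_languages):
    """Generate the first (push) address and one second (pull) address per target language."""
    stream_key = uuid.uuid4().hex[:8]
    first_address = f"rtmp://{CDN_DOMAIN}/app/{live_id}-{stream_key}"  # anchor pushes here
    second_addresses = {
        lang: f"rtmp://{CDN_DOMAIN}/app/{live_id}-{stream_key}-{lang}"  # one translated stream per language
        for lang in target_languages
    }
    return {"first_address": first_address, "second_addresses": second_addresses}
```

Both address sets are then handed to the second server, which reads from the first address and writes each translated stream to its language's second address.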
Subsequently, when the second client requests to watch the live broadcast, a second address corresponding to a target language required by a user associated with the second client may be returned to the second client, so that the second client can obtain a translated target live broadcast stream corresponding to the target language from the second address to play the target live broadcast stream.
In the case of obtaining the target live stream in the foregoing manner, before generating the specific second addresses, the first service end may first call an interface of the second service end to create the director service. For example, a specific timing sequence may be as shown in fig. 3-1: the first service end may first send a "CreateCaster" request to the second service end, and may specifically request the creation of multiple director services; after the second service end completes the creation of a director service, it can return the CasterId to the first service end; thereafter, the first server may configure the Caster (SetCasterConfig), apply to the second server to add a director video source (AddCasterVideoResource), and then set a director channel (SetCasterChannel), add a director layout (AddCasterLayout), add a director component (AddCasterComponent), and so on. After this interaction is completed, the first server may generate multiple second addresses, that is, pull addresses, which correspond to the multiple target languages respectively.
After the second addresses are generated, the creation of the multi-language live broadcast may be completed and the first address provided to the first client, after which the anchor push-stream process may begin. For example, as shown in fig. 3-2, the first client may push the captured source live stream to the first address for saving. Meanwhile, the first server may also call an interface provided by the second server to start the previously created director services (StartCaster), where the at least one director service corresponds to the at least one target language respectively. In addition, parameters such as the first address and the second address may be provided to a specific director service by updating the director configuration information (UpdateCasterSceneConfig) or the like. The director service can then perform streaming voice recognition on the source live stream by calling a streaming voice recognition service and a translation service to obtain a translation result, and generate the translated target live stream by merging the source live stream with the translation result.
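The call sequence described above can be sketched as follows. The API action names match those named in the text, but the client object is a stand-in that only records call order; the real director service would be invoked through the second server's own SDK:

```python
# Stand-in client: records which actions were invoked, in order.
class FakeCasterClient:
    def __init__(self):
        self.calls = []

    def call(self, action, **params):
        self.calls.append(action)
        return {"CasterId": "caster-001"} if action == "CreateCaster" else {}

def provision_and_start(client, first_address, second_address):
    """Create, configure, and start one director service for one target language."""
    caster_id = client.call("CreateCaster")["CasterId"]
    client.call("SetCasterConfig", CasterId=caster_id)
    client.call("AddCasterVideoResource", CasterId=caster_id, Url=first_address)  # register source stream
    client.call("SetCasterChannel", CasterId=caster_id)
    client.call("AddCasterLayout", CasterId=caster_id)       # where subtitles/components are placed
    client.call("AddCasterComponent", CasterId=caster_id)
    client.call("StartCaster", CasterId=caster_id)
    client.call("UpdateCasterSceneConfig", CasterId=caster_id, OutputUrl=second_address)
    return caster_id
```

One such sequence would run per target language, each director service writing to its own second address.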
In specific implementation, the director service may further perform streaming voice recognition on the source live stream and obtain a translation result in a manner of calling a streaming voice recognition service and a translation service provided by a third service end. In this case, the director service may also first apply for a third address, which is carried in the request for invocation of the streaming speech recognition and translation service, in order to save the translation result (translated text or speech) to the third address. In this way, the director service may read the translation result from the third address, and then merge the source live stream of the first address with the translation result of the third address to generate a translated target live stream, and store the target live stream to the corresponding second address.
The third server may be a basic service focused on providing big data processing and the like. The specific translation service may translate the speech recognition result according to a pre-established translation model. In a specific implementation, the multi-language live broadcast in the embodiment of the application may be a live broadcast created in a commodity object information service system; in this case, the translation model can be obtained by training with historical live broadcast records in the commodity object information service system as training data. That is to say, the historical live broadcast records in the commodity object information service system can be provided to the third server and used as training samples to train the translation model, so that the model becomes specialized for the commodity object information service field, improving the accuracy of translation results in this field.
In addition, in a preferred embodiment, translations of specialized vocabulary related to commodity object introduction may be stored in advance, and the speech recognition result may be translated with the help of this information, further improving translation accuracy. That is, during live broadcasts in the commodity object information service field, the anchor user may often use certain proprietary terms which, if domain factors are not considered, have multiple possible translations, so the translation may be inaccurate. In the embodiment of the application, because the live broadcast is known to belong to the commodity object information service field, such specialized vocabulary can be translated in advance in combination with domain information to obtain translation results in multiple different target languages. In particular, when translating the live stream, if such specialized vocabulary is encountered, the pre-recorded results can be used for translation, thereby improving translation accuracy.
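A minimal sketch of the pre-recorded vocabulary lookup (the glossary entries and function names are illustrative, not from the patent): domain terms are translated from the stored table first, and only unknown terms fall through to the general translation service:

```python
# Hypothetical pre-recorded translations of domain-specific live-commerce terms.
GLOSSARY = {
    "秒杀": {"en": "flash sale", "fr": "vente flash"},
    "包邮": {"en": "free shipping", "fr": "livraison gratuite"},
}

def translate_term(term, target_lang, general_translate):
    """Prefer the pre-recorded glossary entry; otherwise call the general translation service."""
    entry = GLOSSARY.get(term, {})
    if target_lang in entry:
        return entry[target_lang]
    return general_translate(term, target_lang)
```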
That is, because the embodiment of the present application can provide the multi-language live broadcast service in a specific field, the narrowness of this field makes it possible to obtain accurate multi-language translation results; that is, the translation results have high readability rather than being merely mechanical translations, thereby providing an effective multi-language live broadcast service.
Moreover, since the anchor user usually introduces information such as commodity objects in spoken language during the live broadcast, the expression grammar may be inaccurate or relatively casual. For example, an anchor might colloquially say "I try on first", where a more standard phrasing would be "let me try it on first". When the grammar is inaccurate, the accuracy of the translation result may be affected. Therefore, in order to further improve the quality of the translation result, the translation service may adjust the sentence structure of the speech recognition result before translating it, for example adjusting sentence components such as the subject, predicate, object, and modifiers, so as to make the sentence structure more standard. It should be noted that adjusting the sentence structure in this way may slightly affect the real-time performance of the translation result; however, in practical applications, because viewer users usually do not have high real-time requirements in the commodity object information service scenario, and the interaction between viewer users and the anchor user is usually not affected, the effect on real-time performance can be ignored.
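The sentence-structure adjustment itself would come from an NLP model; as a toy stand-in (entirely illustrative, not the patent's method), a pre-translation cleanup stage might at least drop spoken fillers and stuttered repetitions before the text reaches the translator:

```python
FILLERS = {"um", "uh", "like"}  # illustrative spoken-language fillers

def normalize_utterance(text):
    """Toy pre-translation cleanup: drop filler words and collapse immediate
    word repetitions, a stand-in for real sentence-structure adjustment."""
    out = []
    for word in text.split():
        if word.lower() in FILLERS:
            continue
        if out and out[-1].lower() == word.lower():
            continue  # collapse stutters like "I I"
        out.append(word)
    return " ".join(out)
```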
In addition, since the source languages used by the anchor user may be different in different source live streams, the present embodiment may involve translation from multiple languages to multiple languages. In order to facilitate translation, the first service end can also determine source language information associated with the live broadcast according to information carried in the request for creating the multi-language live broadcast, and provide the source language information to the second service end. Of course, in the specific implementation, the source language may also be determined by the second server or the specific translation service according to the speech recognition result in the source live stream.
In a specific implementation, the generated translated target live stream may include a live stream associated with subtitles in the target language, and may also include a live stream associated with speech in the target language. That is, the speech in the source live stream can be converted into text and translated into text in the target language, and the text can then be added to the images of the source live stream in the form of subtitles, so that the viewer user can know what the anchor user said by reading the subtitles. Alternatively, after the text translation is completed, speech synthesis may be performed, and the speech stream in the source live stream may be replaced by the translated speech stream to generate the target live stream. In this way, the viewer user can directly listen to speech in the target language while watching the live broadcast.
When translation information is provided in the form of subtitles, the first server can also provide subtitle display parameters to the second server, including subtitle layout parameters, the position, height, and size of the subtitle box, background color, the upper limit on the number of characters, the subtitle font and size, display duration, and the like. In this way, after the second server acquires the translated subtitle stream corresponding to a target language, it adds the subtitles to the source live stream according to these subtitle display parameters to generate the corresponding target live stream.
Specifically, an implementation scenario of the embodiment of the present application may be multi-language live broadcast in a commodity object information service system. An anchor user in such a system is usually a merchant or seller who typically has professional knowledge only in introducing commodities, but may not be professional in live broadcast technology. In addition, the live broadcast device used by the anchor user is usually a mobile terminal device such as a mobile phone, which is itself not professional equipment, so the quality of live broadcast pictures varies. For example, the sharpness of live pictures may differ because different anchors use devices with different resolutions; moreover, an anchor may choose the broadcast location casually, so the background of some live pictures may be cluttered, and so on. Because the subtitle information provided in the embodiment of the present application needs to be added to the live picture, these factors may affect the result of adding subtitles. For example, on a device with relatively low resolution, if the subtitle font is small, the subtitles may be displayed unclearly and be hard to read; where the live picture background is cluttered, a transparent subtitle background may leave part of the subtitles unclear, but if the subtitle background is uniformly set to non-transparent, then for live pictures with simple backgrounds the opaque subtitle background unnecessarily occludes the picture, and so on.
Therefore, in the embodiment of the present application, the subtitle display parameters may be determined according to the actual situation of the specific first client. For example, the resolution of the terminal device associated with the first client and/or the screen orientation (portrait or landscape) required by the live broadcast may be obtained, and the specific subtitle display parameters determined from this information. The resolution information may be obtained by the first client from the local screen parameters of the terminal device, or an operation option may be provided at the first client for the first user to enter the screen parameters. The screen orientation information may be entered by the first user; or, in a specific implementation, live scene information associated with the multi-language live broadcast may be acquired before the live broadcast starts, and advice on the screen orientation provided to the first client according to that scene information. For example, if the live scene is to introduce a clothing commodity object, including showing how the clothing looks when worn, the user may be advised to broadcast in portrait mode, and so on. After this information is determined, the height and size of the subtitle box, the size of the subtitle font, and the like can be determined according to the specific resolution parameters.
In addition, the position of the subtitle box can be determined according to the screen orientation information: for example, in portrait mode the subtitle box can be positioned above the comment area, and in landscape mode it can be positioned to the right of the comment area, so as to avoid the subtitle text and the comment-area text occluding each other, and so on.
In addition, besides information such as screen resolution and screen orientation, live picture background image information may also be acquired; for example, images of the live broadcast site may be captured before the live broadcast starts in order to obtain a background image of the live picture. From this background image, the dominant hue of the background can be determined, or the degree of clutter of the background can be determined, so as to decide whether the subtitle background should be transparent. In the non-transparent case, the subtitle background color can be determined according to the dominant hue of the live picture background, specifically by choosing a color with a large color difference from that dominant hue, so as to improve the legibility of the subtitles.
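Pulling the last few paragraphs together, the parameter derivation could be sketched as below; every threshold and layout number is an illustrative assumption, not taken from the patent:

```python
def subtitle_params(width, height, portrait, background_busy):
    """Derive subtitle display parameters from device resolution, screen
    orientation, and whether the live-picture background is cluttered."""
    font_size = max(14, width // 30)              # scale the font with horizontal resolution
    if portrait:
        box = {"x": 0, "y": int(height * 0.55)}   # above the comment area
    else:
        box = {"x": int(width * 0.6), "y": int(height * 0.8)}  # right of the comment area
    return {
        "font_size": font_size,
        "box": box,
        "max_chars_per_line": width // font_size,
        "background": "opaque" if background_busy else "transparent",
    }
```

The resulting dictionary is what would be handed to the second server as the subtitle display configuration.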
After the various parameters in the aspect of caption display are specifically determined, the parameters can be provided to the second server, so that after the translation in various target languages is acquired, the translation can be added to the caption of the source live stream according to the parameter information, thereby generating target live streams corresponding to various different target languages, and the target live streams can be respectively stored to the second address specified in advance by the first server.
S203: after receiving a request for pulling the live stream submitted by a second client, determining a target language required by a user associated with the second client, and providing the target live stream corresponding to the target language for the second client to play.
After the translated target live streams corresponding to multiple different target languages are obtained, a specific target live stream can be provided to the second client. Specifically, after receiving a request for pulling a live stream submitted by a second client, the target language required by the user associated with the second client is determined, and the target live stream corresponding to that language is provided to the second client for playing. It should be noted that, in a specific implementation, an operation option for turning the multi-language live translation function on or off may be provided in the second client. Thus, when a user sends a specific request to watch the live broadcast, the switch state can first be checked; if the live translation function is on, a request for acquiring the translated target live stream can be submitted to the first service end. Otherwise, if the live translation function is off, a request for acquiring the source live stream can be submitted to the first service end, so that the source live stream is played.
The method for specifically determining the target language required by the user associated with the second client may be various, for example, in one method, the information of the required target language may be submitted when the user initiates a request for acquiring a live stream through the second client. Or, in another mode, the multi-language live broadcast in the embodiment of the present application may specifically include a live broadcast created in a commodity object information service system, and therefore, a target language required by a user associated with the second client may also be determined according to data generated by the user associated with the second client in the commodity object information service system.
For example, specifically, the country/region where the user associated with the second client is located may be determined according to the shipping address information corresponding to that user; then the target language required by the user associated with the second client may be determined according to that country/region. Alternatively, the country/region where the user is located may be determined based on location information associated with the user, and so on.
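The fallback order just described (explicit request, then shipping-address country, then a default) can be sketched as follows; the country-to-language table is an illustrative assumption:

```python
from typing import Optional

# Hypothetical country → language mapping used to infer the viewer's target
# language from commodity-object-system data such as the default shipping address.
COUNTRY_LANG = {"US": "en", "GB": "en", "FR": "fr", "JP": "ja", "DE": "de"}

def infer_target_language(shipping_country: Optional[str],
                          requested: Optional[str] = None,
                          default: str = "en") -> str:
    """An explicit user choice wins; otherwise fall back to the shipping-address
    country, then to a configured default."""
    if requested:
        return requested
    return COUNTRY_LANG.get(shipping_country, default)
```

Since the inference can be wrong, the client still exposes an option to switch to another target language, as the next paragraph notes.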
Certainly, in specific implementation, in the case that the target language required by the user is automatically determined in the above manner, a determination error may occur, and therefore, an operation option for switching other target languages may be provided in the second client, so that the user may switch to the target live stream corresponding to other target languages for playing.
After the target language required by the user is determined, the target live stream corresponding to that language can be provided to the user, and the second client can play it. In this way, when users in different countries/regions watch the same live broadcast, each can obtain live content in the required target language. For example, in the case of providing the target live stream by means of subtitles, as shown in fig. 3-3, the interface seen by a user in an English-speaking country/region can express what the anchor user is currently saying through English subtitles, for example "beauty tips for enhancing your appearance", as shown in (A). The interface seen by a user in a French-speaking country/region can express the same content through French subtitles, as shown in (B). As shown in (C), the interface seen by users in Japanese-speaking countries/regions can express what the anchor user is currently saying through Japanese subtitles.
In addition, the first service end can also, according to clients' access to the second addresses, compile statistics on how users in the countries/regions associated with the at least one target language watch the multi-language live broadcast, and provide the statistical results to the first client. For example, the numbers of viewers of a live broadcast in English-speaking, French-speaking, and Japanese-speaking countries/regions may be counted. This data can be provided to the first client in forms such as a data dashboard, so that the anchor user can intuitively see information such as the popularity of the live broadcast in countries/regions of different languages, which can in turn help the user adjust marketing strategies and the like. For example, if the number of viewers of a live broadcast is significantly higher in English-speaking countries/regions than elsewhere, the marketing strategy can focus on English-speaking countries/regions, and so on. Such dashboard information may also help the user adjust subsequent live broadcast strategies, and so on.
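The per-language viewing statistics can be sketched as a simple aggregation over pull-request records, each tagged with the language of the second address that was served (the record shape is illustrative):

```python
from collections import Counter

def language_view_counts(access_log):
    """Count live-stream pulls per target language; access_log entries are
    (viewer_id, language) pairs recorded whenever a second address is served."""
    return Counter(lang for _viewer, lang in access_log)
```

The resulting counts per language are what a dashboard view on the first client would render.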
In addition, in practical applications, a viewer user can share the live broadcast address with other users so that they can also watch the specific live broadcast. The embodiment of the application can support viewer users sharing with users in other countries/regions. When sharing happens between users of different languages, after receiving a second client's request to share the live broadcast, the first server can also determine the target language required by the shared-to target user and return the address of the target live stream corresponding to that language to the second client, so that the second client can copy the address and share it with the target user, allowing it to be played in the client associated with the target user. For example, a user A shares a live broadcast with a user B. In the conventional manner, user A would directly copy the address at which user A watches the live broadcast to user B; but in this embodiment of the application, if the target language required by user B differs from user A's, user B cannot obtain usable live content from the address copied by user A. Therefore, in the embodiment of the application, a sharing operation option can be provided in the client: when a user needs to share with other users, the user can initiate a sharing request through this option, carrying the required target language information. After receiving the request, the first server can perform address conversion to obtain the address of the target live stream corresponding to the target language required by user B, and return that address to user A; user A then provides the converted address to user B, so that user B can view live content in user B's target language.
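The address-conversion step for cross-language sharing amounts to a lookup in the language-to-second-address map; a minimal sketch (all names illustrative):

```python
def convert_share_address(second_addresses, shared_url, recipient_lang):
    """Given the pull URL the sharer was watching and the recipient's target
    language, return the pull URL for the recipient's language.
    second_addresses maps target language -> pull URL."""
    if shared_url not in second_addresses.values():
        raise ValueError("unknown live stream address")
    if recipient_lang not in second_addresses:
        raise KeyError(recipient_lang)
    return second_addresses[recipient_lang]
```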
In order to better understand the specific technical solution provided by the embodiment of the present application, an alternative implementation scheme provided by the embodiment of the present application is described below with reference to an example when the specific implementation scheme is implemented in a commodity object information service system.
If a certain merchant user needs to introduce commodities to consumer users in multiple countries in a live broadcast mode, a multi-language live broadcast request can be sent out through a first client end associated with the merchant user; the source language used by the user, the required target language and the like can be selected while the request is sent, so that the information can be carried in the request, of course, the source language can be automatically identified by the voice recognition service without selection, the translation is carried out according to the target language configured by default, and the like. In addition, the screen parameters of the terminal device associated with the first client, the screen direction required by live broadcast, the background image information of live broadcast pictures and the like can be carried to the first service end through the request.
After receiving the request for creating the multi-language live broadcast, the first service end can request the second service end to create multiple director services, corresponding to the multiple target languages respectively. In the process of creating the director services, some parameters may also be configured, which may specifically include subtitle display parameters and the like. These parameters can be determined according to the screen parameters of the device associated with the first client, the screen orientation, the dominant hue of the live picture background image, its degree of clutter, and so on, so as to meet the subtitle display needs arising, in the commodity object information service scenario, from the non-professional devices and anchors involved in the live broadcast.
After the interaction with the second server is completed, a first address can be applied for from an associated content delivery network or the like as the push address, and multiple second addresses as pull addresses, thereby completing the creation of the multi-language live broadcast, and the first address is provided to the first client.
After receiving notice that the live broadcast has been created, the first client may store the captured source live stream to the first address; at the same time, the first server may request the second server to start the previously created director services and provide the information of the first address and second addresses to the second server. Accordingly, a director service can read the source live stream from the first address and obtain a speech recognition result and a translation in the target language by calling the speech recognition service and translation service of the third service end. For speech recognition and translation, model training can be performed in advance based on historical live broadcast records in the commodity object information service system, thereby improving the accuracy of translated text. In addition, some proper nouns in the field can be recorded in advance to further improve translation accuracy.
After obtaining the translation of the speech recognition result in the live stream, the second server may add the translation to the images of the source live stream according to the subtitle display parameters previously configured by the first server, so as to generate a target live stream in the corresponding target language, and store the target live stream at the corresponding second address.
The above processes can be respectively completed by a plurality of director services, so that in the live broadcasting process, target live broadcasting streams corresponding to various different target languages can be respectively generated at a plurality of second addresses, and the target live broadcasting streams are respectively provided with subtitles of respective target languages.
When a consumer user in a certain overseas country/region needs to watch the live broadcast, a request for pulling the live broadcast stream can be initiated to the first service end through the second client. At this time, the first server may determine a target language that may be required by the user according to information such as a common shipping address of the user associated with the second client in the commodity object information service system, and then provide the second address corresponding to the target language to the second client. The second client can pull and play the target live stream from the second address, so that the user can understand the content in the live broadcast through the caption. Meanwhile, operation options for selecting more target languages can be provided at the second client side, so that the user can switch to other target languages to watch live content.
In summary, according to the embodiment of the application, the creation of multi-language live broadcasts can be supported, and translated target live streams corresponding to at least one target language can be generated from the source live stream; after the second client initiates a request for acquiring the live stream, the target language required by the user associated with the second client can be determined, and the corresponding target live stream provided to the second client, so that the user can view live content meeting the user's own language requirement.
In a specific implementation, the multilingual live broadcast service may be provided within a commodity object information service system. In that case, training samples can be drawn from historical live broadcast records in the commodity object information service system to train the translation model. In addition, translations of special vocabulary in the field of commodity object information services can be recorded in advance, so as to improve the accuracy of the translation results.
Furthermore, when the multilingual live broadcast service is provided within the commodity object information service system, the country/region to which the user belongs can be determined from user data generated in the system by the user associated with the second client, including the usual shipping address information, so that the target language required by the user is determined automatically.
Example two
The second embodiment corresponds to the first embodiment, and provides a live stream processing method from the perspective of the second server, with reference to fig. 4, where the method may specifically include:
s401: the second service end creates at least one director service according to a request submitted by the first service end; the request is submitted after the first service end receives a request for creating the multilingual live broadcast; the at least one director service corresponds to at least one target language;
s402: acquiring a first address and at least one second address provided by the first service end, wherein the first address is used for storing a source live stream of the live broadcast, and the at least one second address corresponds to at least one target language;
s403: after the multilingual live broadcast is successfully created, starting the director service, where the director service is configured to read the source live stream from the first address, perform streaming voice recognition and translation on the source live stream by calling a streaming voice recognition service and a translation service, merge the source live stream with the translation result to generate a translated target live stream corresponding to the target language, and store the translated target live stream to the second address corresponding to the target language.
In specific implementation, the director service may be specifically configured to invoke a streaming voice recognition service and a translation service provided by a third server, generate a third address, and provide the first address and the third address to the third server, so that the third server obtains a translation result and stores the translation result in the third address; and the director station service reads the translation result through the third address, synthesizes the translation result with the source live stream and generates the target live stream.
Wherein the translation result comprises a translated text stream; at this time, the director service is specifically configured to add a text stream as subtitle information of the source live stream to generate a corresponding target live stream.
Or the translation result comprises a translated voice stream; at this time, the director station service is specifically configured to delete a voice stream from the source live stream, and synthesize the voice stream with the translated voice stream to generate the target live stream.
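The two merge modes just described — adding the translated text as subtitles, or replacing the original voice track with a translated one — can be sketched as a single dispatch step in the director service. This is an illustrative model only; the frame representation and field names are assumptions, not the patent's actual stream format.

```python
from dataclasses import dataclass

@dataclass
class TranslationResult:
    kind: str      # "text" -> translated text stream; "voice" -> translated voice stream
    payload: str   # caption text, or a handle to the synthesized audio

def merge_into_target_stream(source_frame: dict, result: TranslationResult) -> dict:
    """Director-service merge step: either attach the translated text as
    subtitle information, or delete the original voice and substitute the
    translated voice stream."""
    frame = dict(source_frame)          # do not mutate the source frame
    if result.kind == "text":
        frame["subtitle"] = result.payload   # add caption overlay
    elif result.kind == "voice":
        frame["audio"] = result.payload      # replace original voice track
    else:
        raise ValueError(f"unknown translation kind: {result.kind}")
    return frame
```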
Example three
The third embodiment also corresponds to the first embodiment, and provides a live stream processing method from the perspective of the third server. Referring to fig. 5, the method may specifically include:
s501: the third server side creates a streaming voice recognition service and a translation service according to a call request of the second server side, wherein the request carries target language information, a first address and a third address, and the first address is used for storing a source live broadcast stream;
s502: reading the source live stream from the first address, and performing voice recognition on the source live stream through the streaming voice recognition service;
s503: and translating the voice recognition result through a translation service to obtain a translation result corresponding to the target language, and storing the translation result in the third address, so that the second server acquires the translation result from the third address and synthesizes the translation result and the source live stream into a target live stream corresponding to the target language.
Wherein the live broadcast comprises a live broadcast created in a commodity object information service system; at this time, the translation service may specifically be configured to translate the speech recognition result according to a pre-established translation model, where the translation model is obtained by training using a historical live broadcast record in the commodity object information service system as training data.
In addition, the translation service may also translate the speech recognition result according to translation information of a special vocabulary related to the introduction of the commodity object, which is stored in advance.
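The glossary-assisted translation described above can be sketched as a two-stage pipeline: pre-recorded translations of commodity-specific vocabulary are applied first, and the remainder is handed to the trained model. The glossary entries and the model stub below are illustrative assumptions.

```python
# Pre-stored translations for special commodity vocabulary (illustrative).
GLOSSARY = {
    "包邮": "free shipping",
    "秒杀": "flash sale",
}

def model_translate(text: str) -> str:
    """Stand-in for the translation model trained on historical live
    broadcast records; here it just tags its input."""
    return f"<mt:{text}>"

def translate_with_glossary(text: str) -> str:
    """Substitute glossary terms first, then run the model, so domain
    vocabulary always uses the pre-recorded translation."""
    for term, translation in GLOSSARY.items():
        text = text.replace(term, translation)
    return model_translate(text)
```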
Example four
The fourth embodiment provides a live broadcast method from the perspective of the first client associated with the anchor user. Referring to fig. 6, the method may specifically include:
s601: a first client receives a request for creating multi-language live broadcast;
s602: submitting the request to a first service end, and receiving a first address returned by the first service end;
s603: after the live broadcast is successfully created, submitting the generated live stream to the first address, so that the source live stream can be obtained from the first address and a translated target live stream corresponding to at least one target language can be generated for provision to a second client associated with a user having a target-language requirement.
In specific implementation, an operation option for selecting a source language associated with source live broadcast can be provided; and submitting the source language information received through the operation options to the first server.
In addition, statistical information provided by the first service end may also be received and displayed, where the statistical information includes the viewing conditions of the multilingual live broadcast by users in the countries/regions associated with the at least one target language.
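The anchor-side flow of steps s601–s603 can be sketched as follows. The server is stubbed in memory and the ingest-address format is an assumption; a real first service end would return an actual push URL.

```python
class StubFirstServer:
    """In-memory stand-in for the first service end (illustrative only)."""
    def __init__(self):
        self.streams = {}

    def create_live(self, source_language: str) -> str:
        # Returns the first address at which the source live stream is stored.
        addr = f"rtmp://ingest.example.com/{source_language}/room1"
        self.streams[addr] = []
        return addr

    def store(self, addr: str, frame: str) -> None:
        self.streams[addr].append(frame)

class FirstClient:
    def __init__(self, server: StubFirstServer):
        self.server = server
        self.first_address = None

    def create_multilingual_live(self, source_language: str) -> str:
        # s601/s602: forward the creation request, keep the returned address.
        self.first_address = self.server.create_live(source_language)
        return self.first_address

    def push_frame(self, frame: str) -> None:
        # s603: submit captured live-stream data to the first address.
        self.server.store(self.first_address, frame)
```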
Example five
In a fifth embodiment, from the perspective of the second client associated with the viewer user, a method for acquiring a live stream is provided, and referring to fig. 7, the method may specifically include:
s701: the second client side submits a request for acquiring the live stream to the first service side;
s702: receiving a second address provided by the first service end, wherein the second address is determined according to a target language required by a user associated with the second client end, and the second address stores a translated target live stream corresponding to the target language;
s703: and pulling the target live stream through the second address and playing the target live stream.
In specific implementation, an operation option for reselecting the target language can be provided; and submitting the target language reselected through the operation options to the first service end so that the first service end provides a second address corresponding to the reselected target language.
In addition, an operation option for turning on or turning off the multi-language live broadcast translation function can be provided; specifically, when a request for acquiring a live stream is submitted to a first service end, if the live translation function is in an open state, the request for acquiring a translated target live stream is submitted to the first service end. Otherwise, if the live broadcast translation function is in a closed state, submitting a request for acquiring a source live broadcast stream to the first service end so as to play the source live broadcast stream.
In addition, an operation option for sharing the live broadcast can be provided; after receiving a sharing request through the operation options, determining a target language required by a sharing object, and submitting the sharing request and the target language required by the sharing object to the first service end; and after receiving a second address corresponding to the target language required by the shared object and returned by the first service end, providing the second address to the client associated with the shared object.
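The viewer-side request logic just described — pull the translated stream when the translation toggle is on, the source stream when it is off, and resolve the share recipient's language for a sharing request — can be sketched as follows. The request dictionaries and the country-to-language table are illustrative assumptions.

```python
def build_pull_request(translation_enabled: bool, target_language: str) -> dict:
    """Second client: request the translated target stream when the
    multilingual translation function is on, otherwise the source stream."""
    if translation_enabled:
        return {"want": "target", "language": target_language}
    return {"want": "source"}

def build_share_request(share_target_country: str) -> dict:
    """Sharing: determine the target language required by the share object
    (hypothetical country-to-language mapping), then ask the first service
    end for the matching second address."""
    country_to_language = {"JP": "ja", "FR": "fr"}
    return {"want": "share",
            "language": country_to_language.get(share_target_country, "en")}
```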
For the parts of the second to fifth embodiments that are not described in detail, reference may be made to the description of the first embodiment, which is not repeated herein.
It should be noted that the embodiments of the present application may use user data. In practical applications, user-specific personal data may be used in the scheme described herein within the scope permitted by the applicable laws and regulations of the relevant country, and subject to their requirements (for example, with the user's explicit consent, after informing the user, etc.).
Corresponding to the first embodiment, an embodiment of the present application further provides a live broadcast apparatus, where the apparatus is applied to a first service end, and referring to fig. 8, the apparatus may specifically include:
a request receiving unit 801, configured to receive a request for creating a multilingual live broadcast, which is submitted by a first client;
a target live broadcast stream obtaining unit 802, configured to obtain, according to a source live broadcast stream acquired by the first client, a translated target live broadcast stream corresponding to at least one target language after the multilingual live broadcast is successfully created;
the target live stream providing unit 803 is configured to determine a target language required by a user associated with the second client after receiving a request for pulling a live stream submitted by the second client, and provide the target live stream corresponding to the target language to the second client for playing.
Specifically, the target live stream obtaining unit may include:
an address generating unit, configured to generate a first address and at least one second address, where the at least one second address corresponds to at least one target language;
a first address providing unit, configured to provide the first address to the first client, so that after the multi-language live broadcast is successfully created, the first client stores the generated source live broadcast stream to the first address;
a second address providing unit, configured to provide the first address and at least one second address to a second server, so that the second server obtains the source live stream from the first address, and respectively stores the source live stream and the translated target live stream to the second address after obtaining the translated target live stream corresponding to at least one target language;
the target live stream providing unit may specifically be configured to:
and returning a second address corresponding to the target language to the second client, so that the second client can obtain the translated target live stream corresponding to the target language from the second address to play.
The second address providing unit may be specifically configured to:
starting at least one program director service in a second server by calling a service creation interface provided by the second server, wherein the at least one program director service corresponds to the at least one target language respectively; the director station service carries out streaming voice recognition on the source live broadcast stream by calling streaming voice recognition service and translation service, and generates a translated target live broadcast stream by converging the source live broadcast stream and the translation result after obtaining a translation result; the request for calling the director service carries information of the first address and the second address.
Specifically, the director station service may perform streaming voice recognition on the source live stream and acquire a translation result by calling a streaming voice recognition service and a translation service provided by a third server; the director service carries a third address in the call request so as to store the translation result to the third address, and merges the source live stream of the first address and the translation result of the third address to generate a translated target live stream which is stored to the second address.
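The wiring of the three addresses described above can be sketched as follows: the director service generates a third address, hands the first and third addresses to the ASR/translation side, and then merges the source stream with the translation into the second address. The naming scheme for the third address and the callback shape are assumptions made for the example.

```python
from typing import Callable

def start_director(first_address: str,
                   second_address: str,
                   asr_translate: Callable[[str, str], None]) -> dict:
    """Director service setup: create a third address for the translation
    result, ask the third server (here a callable) to fill it from the
    source stream at the first address, and record where the merged target
    stream will be written."""
    third_address = second_address + ".translation"   # assumed naming scheme
    asr_translate(first_address, third_address)       # third server fills it
    return {
        "read_source_from": first_address,
        "read_translation_from": third_address,
        "write_target_to": second_address,
    }
```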
Wherein the multi-language live broadcast comprises a live broadcast established in a commodity object information service system; at this time, the translation service translates the voice recognition result according to a pre-established translation model, and the translation model is obtained by training with historical live broadcast records in the commodity object information service system as training data.
In addition, the translation service may also translate the speech recognition result according to translation information of a special vocabulary related to the introduction of the commodity object, which is stored in advance.
Moreover, the translation service may also adjust a sentence structure of the speech recognition result before translating the speech recognition result.
In a specific implementation, the apparatus may further include:
a source language information determining unit, configured to determine source language information associated with the live broadcast according to information carried in the request for creating a multilingual live broadcast;
and the source language information providing unit is used for providing the source language information to the second server.
Wherein the translated target live stream comprises: the live stream is associated with the subtitle corresponding to the target language; at this time, the apparatus may further include:
and the layout parameter information providing unit is used for providing page layout parameter information to the second server, so that the director service adds the text stream as the subtitle information of the source live stream according to the page layout parameter after acquiring the translated text stream corresponding to the target language, so as to generate a corresponding target live stream.
In addition, the apparatus may further include:
and the counting unit is used for counting the watching conditions of the multi-language live broadcast by the users in the country/region associated with the at least one target language according to the access condition of the second address, and providing a counting result for the first client.
Wherein the translated target live stream comprises: and associating the live stream with the subtitle corresponding to the target language, or associating the live stream with the voice corresponding to the target language.
Specifically, the multi-language live broadcast comprises a live broadcast established in a commodity object information service system; at this time, the target live stream providing unit may be configured to:
and determining a target language required by the user associated with the second client according to the data generated by the user associated with the second client in the commodity object information service system.
Specifically, the target live stream providing unit may be configured to:
determining the country/region where the user associated with the second client is located according to the shipping address information corresponding to the user associated with the second client; and determining the target language required by the user associated with the second client according to the country/region.
In addition, the apparatus may further include:
and the sharing unit is used for determining a target language required by a shared target user when receiving a request of a second client for sharing the live broadcast, and providing address information of a target live broadcast stream corresponding to the target language to the second client so that the second client can share the address to the target user.
Corresponding to the second embodiment, an embodiment of the present application further provides a live stream processing apparatus, where the apparatus is applied to a second server, and referring to fig. 9, the apparatus may specifically include:
a director service creating unit 901, configured to create at least one director service according to a request submitted by a first service end; the request is submitted after the first service terminal receives a request for creating the multi-language live broadcast; the at least one director service corresponds to at least one target language;
an address obtaining unit 902, configured to obtain a first address and at least one second address provided by the first service end, where the first address is used to store the live source live stream, and the at least one second address corresponds to at least one target language;
a director service starting unit 903, configured to start the director service after the multi-language live broadcast is successfully created, where the director service is configured to read the source live broadcast stream from the first address, perform streaming voice recognition on the source live broadcast stream by calling a streaming voice recognition service and a translation service, obtain a translation result corresponding to one of the target languages, merge the source live broadcast stream and the translation result to generate a translated target live broadcast stream corresponding to the target language, and store the translated target live broadcast stream to a second address corresponding to the target language.
The director station service is specifically configured to invoke a streaming voice recognition service and a translation service provided by a third server, generate a third address, and provide the first address and the third address to the third server, so that the third server obtains a translation result and stores the translation result in the third address; and the director station service reads the translation result through the third address, synthesizes the translation result with the source live stream and generates the target live stream.
Wherein the translation result comprises a translated text stream;
the director station service is specifically configured to add a text stream as subtitle information of the source live stream to generate a corresponding target live stream.
Or the translation result comprises a translated voice stream;
the director station service is specifically configured to delete a voice stream from the source live stream, and synthesize the voice stream with the translated voice stream to generate the target live stream.
Corresponding to the third embodiment, an embodiment of the present application further provides a live stream processing apparatus. Referring to fig. 10, the apparatus is applied to a third server and includes:
a service creating unit 1001, configured to create a streaming voice recognition service and a translation service according to a call request of a second server, where the request carries target language information, a first address and a third address, and the first address is used to store a source live stream;
a voice recognition unit 1002, configured to read the source live stream from the first address, and perform voice recognition on the source live stream through the streaming voice recognition service;
the translation unit 1003 is configured to translate a voice recognition result through a translation service to obtain a translation result corresponding to the target language, and store the translation result in the third address, so that the second server obtains the translation result from the third address and synthesizes the translation result and the source live stream into a target live stream corresponding to the target language.
Wherein the live broadcast comprises a live broadcast created in a commodity object information service system;
the translation service translates the voice recognition result according to a pre-established translation model, and the translation model is obtained by training with historical live broadcast records in the commodity object information service system as training data.
In addition, the translation service translates the voice recognition result according to translation information of a special vocabulary related to commodity object introduction, which is stored in advance.
Corresponding to the fourth embodiment, an embodiment of the present application further provides a live broadcast apparatus, which is applied to a first client, and referring to fig. 11, the apparatus may specifically include:
a request receiving unit 1101 configured to receive a request for creating a multilingual live broadcast;
a request submitting unit 1102, configured to submit the request to a first service end, and receive a first address returned by the first service end;
a stream pushing unit 1103, configured to submit the generated live stream to the first address after the live broadcast is successfully created, so that the source live stream can be obtained from the first address and a translated target live stream corresponding to at least one target language can be obtained for provision to a second client associated with a user having a target-language requirement.
In a specific implementation, the apparatus may further include:
the operation option providing unit is used for providing an operation option for selecting a source language associated with the source live broadcast;
and the source language information submitting unit is used for submitting the source language information received by the operation option to the first server.
In addition, the apparatus may further include:
a statistical information receiving unit, configured to receive statistical information provided by the first service end, where the statistical information includes: the users of the country/region associated with the at least one target language respectively watch the multi-language live broadcast;
and the statistical information display unit is used for displaying the statistical information.
Corresponding to the fifth embodiment, an apparatus for acquiring a live stream is further provided in the embodiment of the present application, with reference to fig. 12, where the apparatus is applied to a second client, and includes:
a request submitting unit 1201, configured to submit a request for acquiring a live stream to a first service end;
an address obtaining unit 1202, configured to receive a second address provided by the first service end, where the second address is determined according to a target language required by a user associated with the second client, and the second address stores a translated target live stream corresponding to the target language;
a stream pulling unit 1203, configured to pull the target live stream by using the second address and play the target live stream.
In a specific implementation, the apparatus may further include:
a first operation option providing unit for providing an operation option for reselecting the target language;
and the reselection result submitting unit is used for submitting the target language reselected through the operation options to the first service end so that the first service end provides a second address corresponding to the reselected target language.
In addition, the apparatus may further include:
the second operation option providing unit is used for providing operation options for starting or closing the multi-language live broadcast translation function;
the request submitting unit may be specifically configured to:
and if the live broadcast translation function is in an open state, submitting a request for acquiring the translated target live broadcast stream to the first service terminal.
In addition, the request submitting unit may be further configured to:
and if the live broadcast translation function is in a closed state, submitting a request for acquiring a source live broadcast stream to the first service terminal so as to play the source live broadcast stream.
Furthermore, the apparatus may further include:
a third operation option providing unit, configured to provide an operation option for sharing the live broadcast;
the target language determining unit is used for determining a target language required by a sharing object after receiving the sharing request through the operation option, and submitting the sharing request and the target language required by the sharing object to the first service end;
and the sharing unit is used for receiving a second address which is returned by the first service end and corresponds to the target language required by the shared object, and then providing the second address to the client associated with the shared object.
In addition, the present application also provides a computer-readable storage medium, on which a computer program is stored, where the computer program is characterized in that, when being executed by a processor, the computer program implements the steps of the method in any one of the foregoing method embodiments.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform the steps of the method of any of the preceding method embodiments.
Fig. 13 illustratively shows the architecture of an electronic device. For example, the device 1300 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, an aircraft, or the like.
Referring to fig. 13, device 1300 may include one or more of the following components: a processing component 1302, a memory 1304, a power component 1306, a multimedia component 1308, an audio component 1310, an input/output (I/O) interface 1312, a sensor component 1314, and a communication component 1316.
The processing component 1302 generally controls overall operation of the device 1300, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing element 1302 may include one or more processors 1320 to execute instructions to perform all or part of the steps of the methods provided by the disclosed aspects. Further, the processing component 1302 can include one or more modules that facilitate interaction between the processing component 1302 and other components. For example, the processing component 1302 may include a multimedia module to facilitate interaction between the multimedia component 1308 and the processing component 1302.
The memory 1304 is configured to store various types of data to support operation at the device 1300. Examples of such data include instructions for any application or method operating on device 1300, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1304 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 1306 provides power to the various components of the device 1300. Power components 1306 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 1300.
The multimedia component 1308 includes a screen that provides an output interface between the device 1300 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1308 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the back-facing camera may receive external multimedia data when the device 1300 is in an operational mode, such as a capture mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 1310 is configured to output and/or input audio signals. For example, the audio component 1310 includes a Microphone (MIC) configured to receive external audio signals when the device 1300 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 1304 or transmitted via the communication component 1316. In some embodiments, the audio component 1310 also includes a speaker for outputting audio signals.
The I/O interface 1312 provides an interface between the processing component 1302 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 1314 includes one or more sensors for providing various aspects of state assessment for the device 1300. For example, the sensor assembly 1314 may detect the open/closed state of the device 1300 and the relative positioning of components, such as the display and keypad of the device 1300; the sensor assembly 1314 may also detect a change in the position of the device 1300 or a component of the device 1300, the presence or absence of user contact with the device 1300, the orientation or acceleration/deceleration of the device 1300, and a change in the temperature of the device 1300. The sensor assembly 1314 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 1314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1316 is configured to facilitate communications between the device 1300 and other devices in a wired or wireless manner. The device 1300 may access a wireless network based on a communication standard, such as WiFi, or a mobile communication network such as 2G, 3G, 4G/LTE, 5G, etc. In an exemplary embodiment, the communication component 1316 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 1316 also includes a Near Field Communications (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 1300 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 1304 comprising instructions, executable by the processor 1320 of the device 1300 to perform the methods provided by the aspects of the present disclosure is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
From the above description of the embodiments, it will be clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied, in essence or in the part contributing to the prior art, in the form of a software product. The software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the embodiments, or parts thereof, of the present application.
The embodiments in this specification are described in a progressive manner: identical and similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, reference may be made to the descriptions of the method embodiments. The system embodiments described above are only illustrative: units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of a given embodiment. One of ordinary skill in the art can understand and implement this without inventive effort.
The live broadcast method, live broadcast apparatus, and electronic device provided by the present application have been introduced in detail above. Specific examples are used herein to explain the principles and implementation of the application, and the description of the embodiments is intended only to help in understanding the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the contents of this description should not be construed as limiting the application.

Claims (40)

1. A live broadcast method, comprising:
a first service end receives a request for creating a multi-language live broadcast submitted by a first client;
after the multi-language live broadcast is successfully created, obtaining a translated target live broadcast stream corresponding to at least one target language according to a source live broadcast stream acquired by the first client;
after receiving a request for pulling the live stream submitted by a second client, determining a target language required by a user associated with the second client, and providing the target live stream corresponding to the target language to the second client for playing.
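As a purely illustrative aid, the three steps recited in claim 1 can be sketched in Python. Every name below (`FirstServer`, `ingest_source_stream`, the stream strings) is a hypothetical stand-in for the patent's first service end, not an implementation the patent defines:

```python
# Hypothetical sketch of claim 1's flow; names and stream strings are illustrative.

class FirstServer:
    def __init__(self):
        self.target_languages = []
        self.target_streams = {}   # target language -> translated target stream
        self.user_language = {}    # user id -> target language the user needs

    def create_live(self, target_languages):
        # Step 1: accept the first client's request to create a
        # multi-language live broadcast.
        self.target_languages = list(target_languages)

    def ingest_source_stream(self, source_stream):
        # Step 2: derive one translated target stream per target language
        # (tagging the source stream stands in for ASR + translation + merge).
        for lang in self.target_languages:
            self.target_streams[lang] = f"{source_stream}+subtitles[{lang}]"

    def pull(self, user_id):
        # Step 3: determine the language the viewer needs and return that stream.
        lang = self.user_language.get(user_id, self.target_languages[0])
        return self.target_streams[lang]

server = FirstServer()
server.create_live(["en", "es"])
server.ingest_source_stream("rtmp://src/live-1")
server.user_language["u42"] = "es"
stream = server.pull("u42")
```

The point of the sketch is the routing decision in step 3: the server keeps one translated stream per target language and selects one per viewer, which is what distinguishes claim 1 from a single-stream broadcast.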
2. The method of claim 1,
the obtaining of the translated target live broadcast stream corresponding to at least one target language according to the source live broadcast stream acquired by the first client includes:
generating a first address and at least one second address, the at least one second address corresponding to at least one target language;
providing the first address to the first client, so that the first client stores the generated source live broadcast stream at the first address after the multi-language live broadcast is successfully created;
providing the first address and the at least one second address to a second server, so that the second server obtains the source live broadcast stream from the first address and, after obtaining translated target live broadcast streams corresponding to the at least one target language, stores each translated target live broadcast stream at the corresponding second address;
the providing the target live stream corresponding to the target language to the second client for playing includes:
returning a second address corresponding to the target language to the second client, so that the second client obtains the translated target live stream corresponding to the target language from the second address for playing.
3. The method of claim 2,
the providing the first address and the at least one second address to a second server includes:
creating and starting at least one director service in a second server by calling an interface provided by the second server, wherein the at least one director service corresponds respectively to the at least one target language; the director service performs streaming voice recognition on the source live broadcast stream by calling a streaming voice recognition service and a translation service and, after obtaining a translation result, generates a translated target live broadcast stream by merging the source live broadcast stream with the translation result;
the request for calling the interface carries information of the first address and the second address.
4. The method of claim 3,
the director service performs streaming voice recognition on the source live broadcast stream and acquires a translation result by calling a streaming voice recognition service and a translation service provided by a third service end;
the director service carries a third address in the call request, so that the translation result is stored at the third address, and merges the source live stream at the first address with the translation result at the third address to generate a translated target live stream, which is stored at the second address.
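Claims 2 through 4 describe an indirection through three addresses: the first address holds the source stream, the third address holds the translation result, and the merged target stream lands at a second address. A minimal sketch, assuming an in-memory dict in place of real stream storage and lists of strings in place of media frames:

```python
# Illustrative only: "store" plays the role of addressable stream storage.
store = {}

def push_source(first_addr, frames):
    # First client pushes the source live stream to the first address.
    store[first_addr] = frames

def translate(first_addr, third_addr, lang):
    # Third service end: streaming voice recognition + translation;
    # the translation result is saved at the third address.
    store[third_addr] = [f"{lang}:{frame}" for frame in store[first_addr]]

def director_merge(first_addr, third_addr, second_addr):
    # Director service: merge the source stream with the translation result
    # and store the target stream at the second address.
    store[second_addr] = list(zip(store[first_addr], store[third_addr]))

push_source("addr1", ["hello", "world"])
translate("addr1", "addr3", "fr")
director_merge("addr1", "addr3", "addr2")
```

The indirection keeps the three roles decoupled: each party only needs the addresses it reads from and writes to, never a direct connection to the others.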
5. The method of claim 3,
the multi-language live broadcast comprises a live broadcast established in a commodity object information service system;
the translation service translates the voice recognition result according to a pre-established translation model, and the translation model is obtained by training with historical live broadcast records in the commodity object information service system as training data.
6. The method of claim 5,
the translation service further translates the voice recognition result according to pre-stored translation information for specialized vocabulary related to the introduction of commodity objects.
7. The method of claim 3,
the translation service further adjusts the sentence structure of the voice recognition result before translating it.
8. The method of claim 2, further comprising:
determining source language information associated with the live broadcast according to information carried in the request for creating the multi-language live broadcast;
providing the source language information to the second server.
9. The method of claim 2,
the translated target live stream comprises: a live stream associated with subtitles corresponding to the target language;
the method further comprises the following steps:
providing information on the caption display related parameters to the second server, so that after the director service acquires the translated text stream corresponding to the target language, the text stream is added as caption information of the source live stream according to the caption display related parameters to generate the corresponding target live stream.
10. The method of claim 9, further comprising:
acquiring the resolution of the terminal device associated with the first client and/or the screen orientation information required by the live broadcast process;
determining the caption display related parameters according to the resolution and/or the screen orientation information required by the live broadcast process.
11. The method of claim 10, further comprising:
acquiring live scene information associated with the multi-language live broadcast, and providing suggestion information about the screen orientation to the first client according to the live scene information.
12. The method of claim 9, further comprising:
acquiring background image information of the live broadcast picture associated with the multi-language live broadcast;
determining the caption display related parameters according to the background image information of the live broadcast picture.
13. The method of claim 9,
the caption display related parameters comprise one or more of the following: the subtitle layout; the position, height, and size of the subtitle frame; the background color; an upper limit on the number of words; the subtitle font and size; and the display duration.
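The parameter set of claim 13, and its derivation from resolution and screen orientation in claims 10 and 12, can be pictured as a small configuration object. The field names and the portrait/landscape rule below are assumptions chosen for illustration, not values from the patent:

```python
from dataclasses import dataclass

@dataclass
class CaptionParams:
    position: str            # subtitle frame position, e.g. "bottom"
    frame_height: int        # subtitle frame height in pixels
    font_size: int
    max_chars: int           # upper limit on characters per line
    background: str = "#00000080"
    duration_s: float = 4.0  # how long a caption stays on screen

def params_for(width, height, portrait):
    # Hypothetical rule: portrait rooms have little horizontal space,
    # so use a shorter line limit and a smaller font.
    if portrait:
        return CaptionParams("bottom", height // 10, 28, 16)
    return CaptionParams("bottom", height // 8, 32, 40)

p = params_for(1080, 1920, portrait=True)
```

A real system would also consult the background image of the live picture (claim 12) when choosing the background color and position; that step is omitted here.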
14. The method of claim 2, further comprising:
according to the access conditions of the second addresses, compiling statistics on how users in the countries/regions associated with the at least one target language respectively watch the multi-language live broadcast, and providing the statistical result to the first client.
15. The method of claim 1,
the translated target live stream comprises: a live stream associated with subtitles corresponding to the target language, or a live stream associated with voice corresponding to the target language.
16. The method of claim 1,
the multi-language live broadcast comprises a live broadcast established in a commodity object information service system;
the determining a target language required by the user associated with the second client comprises:
determining the target language required by the user associated with the second client according to data generated by that user in the commodity object information service system.
17. The method of claim 16,
the determining a target language required by the user associated with the second client comprises:
determining the country/region where the user associated with the second client is located according to the shipping address information corresponding to that user;
determining the target language required by the user associated with the second client according to the country/region.
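Claims 16 and 17 infer the viewer's target language from platform data such as the shipping address. A hedged sketch, with a made-up country-to-language table standing in for data held by the commodity object information service system:

```python
# Hypothetical mapping; a real platform would look this up from user data.
COUNTRY_TO_LANGUAGE = {"FR": "fr", "BR": "pt", "US": "en", "SG": "en"}

def target_language(shipping_address, default="en"):
    # Claim 17: country/region from the shipping address -> target language.
    country = shipping_address.get("country")
    return COUNTRY_TO_LANGUAGE.get(country, default)

lang = target_language({"country": "BR", "city": "São Paulo"})
```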
18. The method of claim 1,
when a request for sharing the live broadcast is received from a second client, determining the target language required by the target user being shared with, and providing address information of the target live stream corresponding to that target language to the second client, so that the second client can share the address with the target user.
19. A live stream processing method, comprising:
a second service end creates at least one director service according to a request submitted by a first service end, the request being submitted after the first service end receives a request for creating a multi-language live broadcast; the at least one director service corresponds to at least one target language;
acquiring a first address and at least one second address provided by the first service end, wherein the first address is used for storing the source live stream of the live broadcast, and the at least one second address corresponds to the at least one target language;
after the multi-language live broadcast is successfully created, starting the director service, wherein the director service is used for reading the source live stream from the first address, performing streaming voice recognition on the source live stream by calling a streaming voice recognition service and a translation service, merging the source live stream with the translation result to generate a translated target live stream corresponding to a target language, and storing the translated target live stream at the second address corresponding to that target language.
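The fan-out described in claim 19 amounts to one director service per target language, each reading the same first address and writing to its own second address. In this sketch the ASR and translation stand-ins are deliberately trivial, and all names are assumptions:

```python
def asr(frames):
    # Stand-in for streaming voice recognition.
    return [frame.upper() for frame in frames]

def translate(texts, lang):
    # Stand-in for the translation service.
    return [f"[{lang}] {text}" for text in texts]

def run_directors(source_frames, target_languages):
    # One director service per target language (claim 19).
    second_addresses = {}
    for lang in target_languages:
        subtitles = translate(asr(source_frames), lang)
        # Merge: pair each source frame with its translated subtitle,
        # i.e. the "translated target live stream" for this language.
        second_addresses[lang] = list(zip(source_frames, subtitles))
    return second_addresses

out = run_directors(["hi there"], ["fr", "de"])
```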
20. The method of claim 19,
the director service is specifically configured to call a streaming voice recognition service and a translation service provided by a third server, generate a third address, and provide the first address and the third address to the third server, so that the third server stores the translation result at the third address after obtaining it; the director service reads the translation result through the third address and synthesizes it with the source live stream to generate the target live stream.
21. The method of claim 19,
the translation result comprises a translated text stream;
the director service is specifically configured to add the text stream as subtitle information of the source live stream to generate the corresponding target live stream.
22. The method of claim 19,
the translation result comprises a translated voice stream;
the director service is specifically configured to delete the original voice stream from the source live stream and synthesize the result with the translated voice stream to generate the target live stream.
23. A live stream processing method, comprising:
a third server creates a streaming voice recognition service and a translation service according to a call request from a second server, wherein the request carries target language information, a first address, and a third address, and the first address is used for storing a source live stream;
reading the source live stream from the first address, and performing voice recognition on the source live stream through the streaming voice recognition service;
translating the voice recognition result through the translation service to obtain a translation result corresponding to the target language, and storing the translation result at the third address, so that the second server acquires the translation result from the third address and synthesizes it with the source live stream into a target live stream corresponding to the target language.
24. The method of claim 23,
the live broadcast comprises a live broadcast established in a commodity object information service system;
the translation service translates the voice recognition result according to a pre-established translation model, and the translation model is obtained by training with historical live broadcast records in the commodity object information service system as training data.
25. The method of claim 23,
the translation service further translates the voice recognition result according to pre-stored translation information for specialized vocabulary related to the introduction of commodity objects.
26. A live broadcast method, comprising:
a first client receives a request for creating multi-language live broadcast;
submitting the request to a first service end, and receiving a first address returned by the first service end;
after the live broadcast is successfully created, submitting the generated live stream to the first address, so that the source live stream can be obtained from the first address and a translated target live stream corresponding to at least one target language can be obtained, for providing to a second client associated with a user having a target language requirement.
27. The method of claim 26, further comprising:
providing an operation option for selecting a source language associated with the source live broadcast;
submitting the source language information received through the operation option to the first service end.
28. The method of claim 26, further comprising:
receiving statistical information provided by the first service end, wherein the statistical information comprises: how users in the countries/regions associated with the at least one target language respectively watch the multi-language live broadcast;
displaying the statistical information.
29. A method for acquiring a live stream, comprising:
the second client side submits a request for acquiring the live stream to the first service side;
receiving a second address provided by the first service end, wherein the second address is determined according to a target language required by a user associated with the second client, and the second address stores a translated target live stream corresponding to the target language;
pulling the target live stream through the second address and playing it.
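From the second client's side (claims 29 and 30), the exchange reduces to: request a stream, receive the second address matched to the viewer's language, then pull and play from that address. The `Server` stub and the address scheme here are hypothetical:

```python
class Server:
    # Stand-in for the first service end of claim 29.
    def __init__(self, addresses):
        self.addresses = addresses        # target language -> second address

    def second_address_for(self, lang):
        return self.addresses[lang]

def watch(server, lang):
    addr = server.second_address_for(lang)   # submit request, receive second address
    return f"playing stream from {addr}"     # pull the target stream and play it

server = Server({"en": "cdn://live/en", "ja": "cdn://live/ja"})
msg = watch(server, "ja")
```

Reselecting a language (claim 30) is just a second call to `watch` with the new language; the client never needs to know how the translated stream was produced.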
30. The method of claim 29, further comprising:
providing an operation option for reselecting the target language;
submitting the target language reselected through the operation option to the first service end, so that the first service end provides a second address corresponding to the reselected target language.
31. The method of claim 29, further comprising:
providing an operation option for turning on or turning off a multi-language live translation function;
the submitting a request for acquiring the live stream to the first service end includes:
if the live broadcast translation function is turned on, submitting a request for acquiring the translated target live stream to the first service end.
32. The method of claim 31, further comprising:
if the live broadcast translation function is turned off, submitting a request for acquiring the source live stream to the first service end, so as to play the source live stream.
33. The method of claim 29, further comprising:
providing an operation option for sharing the live broadcast;
after receiving a sharing request through the operation option, determining the target language required by the sharing object, and submitting the sharing request and the target language required by the sharing object to the first service end;
after receiving the second address, returned by the first service end, corresponding to the target language required by the sharing object, providing the second address to the client associated with the sharing object.
34. A live broadcast device, applied to a first service end, comprising:
a request receiving unit, configured to receive a request for creating a multi-language live broadcast submitted by a first client;
a target live broadcast stream obtaining unit, configured to obtain, according to a source live broadcast stream acquired by the first client, a translated target live broadcast stream corresponding to at least one target language after the multi-language live broadcast is successfully created;
a target live stream providing unit, configured to, after receiving a request for pulling the live stream submitted by a second client, determine a target language required by a user associated with the second client, and provide the target live stream corresponding to the target language to the second client for playing.
35. A live stream processing device, applied to a second server, comprising:
a director service creating unit, configured to create at least one director service according to a request submitted by a first service end; the request is submitted after the first service end receives a request for creating the multi-language live broadcast; the at least one director service corresponds to at least one target language;
an address obtaining unit, configured to obtain a first address and at least one second address provided by the first service end, where the first address is used to store a source live stream of the live broadcast, and the at least one second address corresponds to at least one target language;
a director service starting unit, configured to start the director service after the multi-language live broadcast is successfully created, wherein the director service is used for reading the source live stream from the first address, performing streaming voice recognition on the source live stream by calling a streaming voice recognition service and a translation service, acquiring a translation result corresponding to a target language, merging the source live stream with the translation result to generate a translated target live stream corresponding to the target language, and storing the translated target live stream at the second address corresponding to the target language.
36. A live stream processing device, applied to a third server, comprising:
a service creating unit, configured to create a streaming voice recognition service and a translation service according to a call request from a second server, wherein the request carries target language information, a first address, and a third address, and the first address is used for storing a source live stream;
a voice recognition unit, configured to read the source live stream from the first address and perform voice recognition on the source live stream through the streaming voice recognition service;
a translation unit, configured to translate the voice recognition result through the translation service to obtain a translation result corresponding to the target language, and store the translation result at the third address, so that the second server acquires the translation result from the third address and synthesizes it with the source live stream into a target live stream corresponding to the target language.
37. A live broadcast device, applied to a first client, comprising:
a request receiving unit, configured to receive a request for creating a multi-language live broadcast;
a request submitting unit, configured to submit the request to a first service end and receive a first address returned by the first service end;
a stream pushing unit, configured to submit the generated live stream to the first address after the live broadcast is successfully created, so that the source live stream can be obtained from the first address and a translated target live stream corresponding to at least one target language can be obtained, for providing to a second client associated with a user having a target language requirement.
38. A device for acquiring a live stream, applied to a second client, comprising:
a request submitting unit, configured to submit a request for acquiring the live stream to a first service end;
an address obtaining unit, configured to receive a second address provided by the first service end, wherein the second address is determined according to a target language required by a user associated with the second client, and the second address stores a translated target live stream corresponding to that target language;
a stream pulling unit, configured to pull the target live stream through the second address and play it.
39. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 33.
40. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform the steps of the method of any of claims 1 to 33.
CN202010733464.9A 2020-07-27 2020-07-27 Live broadcast method and device and electronic equipment Active CN113301357B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010733464.9A CN113301357B (en) 2020-07-27 2020-07-27 Live broadcast method and device and electronic equipment
PCT/CN2021/107766 WO2022022370A1 (en) 2020-07-27 2021-07-22 Live streaming method and apparatus, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010733464.9A CN113301357B (en) 2020-07-27 2020-07-27 Live broadcast method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113301357A true CN113301357A (en) 2021-08-24
CN113301357B CN113301357B (en) 2022-11-29

Family

ID=77318168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010733464.9A Active CN113301357B (en) 2020-07-27 2020-07-27 Live broadcast method and device and electronic equipment

Country Status (2)

Country Link
CN (1) CN113301357B (en)
WO (1) WO2022022370A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116016964A (en) * 2022-12-02 2023-04-25 京东科技信息技术有限公司 Live broadcast steady flow method, device, equipment and computer readable storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106340294A (en) * 2016-09-29 2017-01-18 安徽声讯信息技术有限公司 Synchronous translation-based news live streaming subtitle on-line production system
CN108401192A (en) * 2018-04-25 2018-08-14 腾讯科技(深圳)有限公司 Video stream processing method, device, computer equipment and storage medium
CN108566558A (en) * 2018-04-24 2018-09-21 腾讯科技(深圳)有限公司 Video stream processing method, device, computer equipment and storage medium
CN108737845A (en) * 2018-05-22 2018-11-02 北京百度网讯科技有限公司 Processing method, device, equipment and storage medium is broadcast live
CN110111775A (en) * 2019-05-17 2019-08-09 腾讯科技(深圳)有限公司 A kind of Streaming voice recognition methods, device, equipment and storage medium
CN110636323A (en) * 2019-10-15 2019-12-31 博科达(北京)科技有限公司 Global live broadcast and video on demand system and method based on cloud platform
CN110769265A (en) * 2019-10-08 2020-02-07 深圳创维-Rgb电子有限公司 Simultaneous caption translation method, smart television and storage medium
CN111191472A (en) * 2019-12-31 2020-05-22 湖南师语信息科技有限公司 Teaching auxiliary translation learning system and method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106921873A (en) * 2017-02-28 2017-07-04 北京小米移动软件有限公司 Live-broadcast control method and device
WO2019040400A1 (en) * 2017-08-21 2019-02-28 Kudo, Inc. Systems and methods for changing language during live presentation
CN110730952B (en) * 2017-11-03 2021-08-31 腾讯科技(深圳)有限公司 Method and system for processing audio communication on network


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113452935A (en) * 2021-08-31 2021-09-28 成都索贝数码科技股份有限公司 Horizontal screen and vertical screen live video generation system and method
CN113452935B (en) * 2021-08-31 2021-11-09 成都索贝数码科技股份有限公司 Horizontal screen and vertical screen live video generation system and method
CN114501042A (en) * 2021-12-20 2022-05-13 阿里巴巴(中国)有限公司 Cross-border live broadcast processing method and electronic equipment
CN114745595A (en) * 2022-05-10 2022-07-12 上海哔哩哔哩科技有限公司 Bullet screen display method and device
CN114866822A (en) * 2022-05-10 2022-08-05 上海哔哩哔哩科技有限公司 Live broadcast stream pushing method and device and live broadcast stream pulling method and device
CN114745595B (en) * 2022-05-10 2024-02-27 上海哔哩哔哩科技有限公司 Bullet screen display method and device
CN114866822B (en) * 2022-05-10 2024-04-09 上海哔哩哔哩科技有限公司 Live broadcast push stream method and device, and live broadcast pull stream method and device
CN116847113A (en) * 2023-06-20 2023-10-03 联城科技(河北)股份有限公司 Video live broadcast transfer system and method based on cloud architecture module
CN116847113B (en) * 2023-06-20 2024-03-12 联城科技(河北)股份有限公司 Video live broadcast transfer method, device, equipment and medium based on cloud architecture module

Also Published As

Publication number Publication date
CN113301357B (en) 2022-11-29
WO2022022370A1 (en) 2022-02-03

Similar Documents

Publication Publication Date Title
CN113301357B (en) Live broadcast method and device and electronic equipment
CN111970533B (en) Interaction method and device for live broadcast room and electronic equipment
CN104469437A (en) Advertisement pushing method and device
CN108962220B (en) Text display method and device in multimedia file playing scene
CN111626807A (en) Commodity object information processing method and device and electronic equipment
CN111866596A (en) Bullet screen publishing and displaying method and device, electronic equipment and storage medium
CN109413478B (en) Video editing method and device, electronic equipment and storage medium
CN107959864B (en) Screen capture control method and device
US11545188B2 (en) Video processing method, video playing method, devices and storage medium
CN113301363B (en) Live broadcast information processing method and device and electronic equipment
US20220201195A1 (en) Image acquisition method, device, apparatus and storage medium
CN107229403B (en) Information content selection method and device
US20240320256A1 (en) Method, apparatus, device, readable storage medium and product for media content processing
US20220078221A1 (en) Interactive method and apparatus for multimedia service
CN108028966B (en) Video providing device, video providing method, and computer program
CN106331830A (en) Method, device, equipment and system for processing live broadcast
CN113886612A (en) Multimedia browsing method, device, equipment and medium
CN112532931A (en) Video processing method and device and electronic equipment
CN112151041B (en) Recording method, device, equipment and storage medium based on recorder program
CN105744338B (en) A kind of method for processing video frequency and its equipment
CN107247794B (en) Topic guiding method in live broadcast, live broadcast device and terminal equipment
CN114302221A (en) Virtual reality equipment and screen-casting media asset playing method
CN114501042B (en) Cross-border live broadcast processing method and electronic equipment
CN111984767A (en) Information recommendation method and device and electronic equipment
CN114464186A (en) Keyword determination method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240220

Address after: #01-21, Lazada One, 51 Bras Basah Road, Singapore

Patentee after: Alibaba Singapore Holdings Ltd.

Country or region after: Singapore

Address before: Fourth Floor, P.O. Box 847, Capital Building, Grand Cayman, Cayman Islands

Patentee before: ALIBABA GROUP HOLDING Ltd.

Country or region before: United Kingdom
