WO2024060134A1 - Information processing method, apparatus, electronic device and computer-readable storage medium - Google Patents


Info

Publication number
WO2024060134A1
Authority
WO
WIPO (PCT)
Prior art keywords
video stream
stream data
client
data
playback
Application number
PCT/CN2022/120543
Other languages
English (en)
French (fr)
Inventor
石奇峰
褚虓
Original Assignee
京东方科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司
Priority to PCT/CN2022/120543
Publication of WO2024060134A1


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • Embodiments of the present disclosure relate to an information processing method, an apparatus, an electronic device, and a computer-readable storage medium.
  • Streaming media technology has become the mainstream technology for audio and video transmission.
  • Streaming media technology enables watching and listening while downloading, rather than requiring the entire audio and video file to be downloaded to the local computer before playback can begin.
  • At least one embodiment of the present disclosure provides an information processing method, including: receiving a first initial application from a first client and responding to the first initial application, wherein the first initial application includes playback parameters of video stream data that the first client supports playing; after responding to the first initial application, receiving a task instruction from the first client; obtaining first video stream data according to the task instruction; converting the first video stream data into second video stream data, so that the second video stream data has the playback parameters; and providing the second video stream data to the first client so that the second video stream data is played on the first client.
  • receiving the first initial application from the first client and responding to the first initial application includes: receiving the first initial application from the first client; and, in response to the first initial application, creating a working thread and opening an input buffer area and an output buffer area for the working thread.
  • the input buffer area is configured to store the first video stream data;
  • the working thread is configured to obtain the first video stream data from the input buffer area and to process the first video stream data to obtain the second video stream data; and
  • the output buffer area is configured to receive the second video stream data provided by the working thread and to provide the second video stream data to the first client.
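The claimed thread-and-buffer setup can be sketched with Python's standard threading and queue modules. All names here (`start_worker`, `transcode`) are illustrative stand-ins, not from the patent; `transcode` merely uppercases bytes in place of a real decode/encode step.

```python
import queue
import threading

def transcode(chunk: bytes) -> bytes:
    return chunk.upper()  # stand-in for decoding + re-encoding

def start_worker(buffer_size: int = 64):
    # input buffer stores the first video stream data; the output
    # buffer receives the second video stream data from the worker
    input_buf: queue.Queue = queue.Queue(maxsize=buffer_size)
    output_buf: queue.Queue = queue.Queue(maxsize=buffer_size)

    def work():
        while True:
            chunk = input_buf.get()           # read first video stream data
            if chunk is None:                 # sentinel: end of stream
                output_buf.put(None)
                break
            output_buf.put(transcode(chunk))  # write second video stream data

    thread = threading.Thread(target=work, daemon=True)
    thread.start()
    return input_buf, output_buf, thread

inp, out, worker = start_worker()
inp.put(b"frame-1")
inp.put(None)
assert out.get(timeout=1) == b"FRAME-1"
```

The bounded `maxsize` mirrors the fixed buffer areas opened for the working thread; a full queue blocks the producer, anticipating the read/write locks described later.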
  • obtaining the first video stream data according to the task instruction includes: determining a manner of obtaining the first video stream data according to the task instruction, and obtaining the first video stream data in that manner.
  • determining an acquisition manner for obtaining the first video stream data according to the task instruction, and obtaining the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the uniform resource locator manner, obtaining the uniform resource locator from the task instruction; obtaining the first video stream data according to the uniform resource locator; and storing the first video stream data in the input buffer area.
  • determining an acquisition manner for obtaining the first video stream data according to the task instruction, and obtaining the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the buffer manner, extracting the first video stream data from the task instruction; and storing the first video stream data in the input buffer area.
  • determining an acquisition manner for obtaining the first video stream data according to the task instruction, and obtaining the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the fragmented byte stream manner, sequentially receiving multiple task sub-instructions provided by the first client, wherein the multiple task sub-instructions respectively include different parts of the first video stream data; and sequentially extracting the partial video stream data of the first video stream data from the multiple task sub-instructions and sequentially storing the partial video stream data in the input buffer area.
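The three acquisition manners named above (uniform resource locator, buffer, fragmented byte stream) can be sketched as a dispatch on the task instruction. Field names such as `mode`, `url` and `payload` are assumptions for illustration, not from the patent.

```python
def acquire(task: dict, input_buf: list, fetch_url=None) -> None:
    mode = task["mode"]
    if mode == "url":
        # uniform resource locator manner: the server fetches the stream itself
        input_buf.append(fetch_url(task["url"]))
    elif mode == "buffer":
        # buffer manner: the stream bytes are embedded in the instruction
        input_buf.append(task["payload"])
    elif mode == "fragments":
        # fragmented byte stream manner: parts arrive over sub-instructions
        for sub in task["subtasks"]:
            input_buf.append(sub["payload"])
    else:
        raise ValueError(f"unknown acquisition mode: {mode}")

buf: list = []
acquire({"mode": "buffer", "payload": b"\x00\x01"}, buf)
acquire({"mode": "fragments",
         "subtasks": [{"payload": b"a"}, {"payload": b"b"}]}, buf)
assert buf == [b"\x00\x01", b"a", b"b"]
```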
  • converting the first video stream data into the second video stream data, so that the second video stream data has the playback parameters, includes: the working thread reading the first video stream data from the input buffer area and storing the first video stream data into the processing queue of the working thread; and, according to the playback parameters, the working thread decoding and encoding the first video stream data in the processing queue to obtain the second video stream data.
  • providing the second video stream data to the first client so that the second video stream data is played on the first client includes: generating encoding information according to the encoding operation performed by the working thread on the first video stream data; storing the encoding information at the head of the output buffer area so that the first data packet output by the output buffer area contains the encoding information; writing the second video stream data into the output buffer area; and providing multiple data packets from the output buffer area to the first client in sequence, wherein the multiple data packets include the encoding information and the second video stream data.
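The head-of-buffer encoding information can be sketched as a packet list whose first element carries the metadata. The JSON framing here is an assumption; the patent does not specify the packet format.

```python
import json

def build_output_packets(encoding_info: dict, stream_chunks: list) -> list:
    # first packet: encoding information, so the client can configure
    # its player before any media data arrives
    head = json.dumps(encoding_info).encode()
    return [head] + list(stream_chunks)  # then the second video stream data

packets = build_output_packets(
    {"codec": "h264", "container": "fmp4"},
    [b"moof+mdat-1", b"moof+mdat-2"],
)
assert json.loads(packets[0])["codec"] == "h264"
assert packets[1:] == [b"moof+mdat-1", b"moof+mdat-2"]
```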
  • writing the second video stream data into the output buffer area includes: encapsulating the basic playback information of key frames in the second video stream data into an obfuscation packet, so that the first client can parse the obfuscation packet to obtain the basic playback information.
  • the obfuscation packet includes a plurality of randomly generated bytes.
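One possible shape for the obfuscation packet, assuming a length prefix for the real payload followed by the randomly generated bytes the claim mentions. The exact layout is not given in the source; this is an illustrative guess.

```python
import os
import struct

def pack_obfuscated(info: bytes, pad: int = 16) -> bytes:
    # 4-byte big-endian length, the key frame's basic playback info,
    # then a tail of randomly generated bytes
    return struct.pack(">I", len(info)) + info + os.urandom(pad)

def unpack_obfuscated(packet: bytes) -> bytes:
    (n,) = struct.unpack(">I", packet[:4])
    return packet[4:4 + n]  # the random tail is ignored by the parser

pkt = pack_obfuscated(b'{"fps":30,"keyframe":true}')
assert unpack_obfuscated(pkt) == b'{"fps":30,"keyframe":true}'
```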
  • the working thread includes a read lock, and the read lock is triggered when the input buffer area is full; in response to the read lock being triggered, the working thread no longer reads the first video stream data from the input buffer area.
  • the working thread further includes a write lock.
  • the write lock is triggered when the output buffer area is full; in response to the write lock being triggered, the working thread no longer writes the second video stream data to the output buffer area and no longer reads the first video stream data from the input buffer area.
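The read lock / write lock behavior can be modeled as two conditions over buffer occupancy; note that a tripped write lock also stops reading, as the claim states. This is a simplified single-threaded sketch, not the patent's implementation.

```python
class Backpressure:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.input_buf: list = []
        self.output_buf: list = []

    @property
    def read_locked(self) -> bool:
        # reads stop when the input buffer is full, or when the
        # write lock is tripped (no point reading what cannot be written)
        return len(self.input_buf) >= self.capacity or self.write_locked

    @property
    def write_locked(self) -> bool:
        # writes stop when the output buffer is full
        return len(self.output_buf) >= self.capacity

bp = Backpressure(capacity=2)
bp.input_buf = [b"a", b"b"]        # input full: read lock trips
assert bp.read_locked and not bp.write_locked
bp.input_buf = []
bp.output_buf = [b"x", b"y"]       # output full: both reads and writes stop
assert bp.write_locked and bp.read_locked
```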
  • it further includes: obtaining the reading speed of the first video stream data in the input buffer area; obtaining the transcoding speed at which the working thread converts the first video stream data into the second video stream data; obtaining the writing speed at which the output buffer area outputs the second video stream data; and determining whether the transcoding speed is greater than the writing speed and the writing speed is greater than the reading speed.
  • in response to the reading speed, the transcoding speed and the writing speed not satisfying the condition that the transcoding speed is greater than the writing speed and the writing speed is greater than the reading speed, adjusting the reading speed, the transcoding speed and the writing speed.
  • the transcoding code rate is divided into multiple coding layers whose transcoding code rates decrease in sequence.
  • in response to the reading speed, the transcoding speed and the writing speed not satisfying the condition that the transcoding speed is greater than the writing speed and the writing speed is greater than the reading speed, adjusting the reading speed, the transcoding speed and the writing speed includes: in response to the transcoding speed being lower than the writing speed, starting from the next key frame of the first video stream data, adjusting the transcoding code rate to the transcoding code rate corresponding to the next coding layer below the current coding layer.
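A minimal sketch of the layered bitrate downgrade, assuming example bitrates per coding layer; the source only fixes the ordering (decreasing code rates) and the trigger (transcoding speed below writing speed, applied at the next key frame).

```python
LAYERS = [8000, 4000, 2000, 1000]  # hypothetical bitrates per layer (kbps)

def healthy(read_speed, transcode_speed, write_speed) -> bool:
    # the claimed steady-state condition:
    # transcoding speed > writing speed > reading speed
    return transcode_speed > write_speed > read_speed

def next_layer(current: int) -> int:
    # step down one coding layer, clamping at the lowest bitrate
    i = LAYERS.index(current)
    return LAYERS[min(i + 1, len(LAYERS) - 1)]

assert healthy(10, 30, 20)
assert not healthy(10, 15, 20)   # transcoding slower than writing
assert next_layer(8000) == 4000
assert next_layer(1000) == 1000  # already at the lowest layer
```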
  • it further includes: obtaining a player instruction of the first client; and controlling, according to the player instruction, the playback status of the second video stream data in the first client.
  • the player instructions include at least one of the following: a pause playback instruction, a start playback instruction, a double-speed playback instruction, a reset-to-initial-state instruction, and a jump instruction.
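The listed player instructions can be modeled as a small command enum. Only the start-playback code (1) is suggested elsewhere in the text; the remaining numeric codes and the `handle` logic are illustrative assumptions.

```python
from enum import Enum

class PlayerCmd(Enum):
    START = 1  # matches "second byte 1 = start playback" in the text
    PAUSE = 2  # codes below are invented for the sketch
    SPEED = 3
    RESET = 4
    SEEK = 5

def handle(cmd: PlayerCmd, state: dict) -> dict:
    # control the playback status according to the player instruction
    if cmd is PlayerCmd.START:
        state["playing"] = True
    elif cmd is PlayerCmd.PAUSE:
        state["playing"] = False
    elif cmd is PlayerCmd.RESET:
        state.update(playing=False, position=0.0, rate=1.0)
    return state

s = handle(PlayerCmd.START, {"playing": False, "position": 0.0, "rate": 1.0})
assert s["playing"] is True
```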
  • the information processing method further includes: before receiving the first initial application from the first client, establishing a two-way data channel with the first client; and, after responding to the first initial application, establishing a correspondence among the two-way data channel, the output buffer area, the input buffer area, the verification information and the working thread, so as to interact with the first client according to the correspondence.
  • the information processing method also includes: in response to receiving a second initialization application provided by a second client, obtaining information to be verified from the second initialization application; and in response to the information to be verified being consistent with the verification information, obtaining an output buffer area corresponding to the verification information according to the corresponding relationship, and providing the second video stream data to the second client from the output buffer area.
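The second-client flow above can be sketched as a registry keyed by verification information, so a second initialization application carrying matching info is served from the same output buffer. The token format and registry shape are invented for illustration.

```python
sessions = {}  # verification information -> shared output buffer

def register(token: str, output_buf: list) -> None:
    # record the correspondence between verification info and output buffer
    sessions[token] = output_buf

def join_second_client(to_verify: str):
    # the shared output buffer is returned only when the info matches;
    # a mismatch yields None (no session)
    return sessions.get(to_verify)

register("token-42", [b"pkt1", b"pkt2"])
assert join_second_client("token-42") == [b"pkt1", b"pkt2"]
assert join_second_client("wrong") is None
```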
  • the information processing method provided by an embodiment of the present disclosure further includes: monitoring the read lock and the write lock of the working thread through the controller to obtain read lock events and write lock events; and providing, by the controller, the read lock events and the write lock events to the first client.
  • the controller regularly provides messages to the working thread; the method further includes: in response to the controller not receiving a response message from the working thread within a preset time period, clearing the working thread.
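The controller's heartbeat and cleanup can be sketched as a timestamp sweep; the timeout value and method names are assumptions.

```python
class Controller:
    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_reply: dict = {}

    def on_reply(self, worker_id: str, now: float) -> None:
        # a worker answered the controller's regular message
        self.last_reply[worker_id] = now

    def sweep(self, now: float) -> list:
        """Return and clear workers with no reply within the timeout."""
        dead = [w for w, t in self.last_reply.items()
                if now - t > self.timeout]
        for w in dead:
            del self.last_reply[w]  # clear the working thread's record
        return dead

c = Controller(timeout=5.0)
c.on_reply("worker-1", now=0.0)
c.on_reply("worker-2", now=3.0)
assert c.sweep(now=6.0) == ["worker-1"]  # only worker-1 timed out
```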
  • the working thread decoding and encoding the first video stream data in the processing queue to obtain the second video stream data includes: the working thread loading the first video stream data in the processing queue into the codec; and the codec decoding and encoding the first video stream data to obtain the second video stream data.
  • it further includes: in response to an exception occurring in the first video stream data before entering the codec, the working thread sending a rollback event to the controller; and the controller returning the first video stream data to the input buffer area in response to the rollback event.
  • the method further includes: in response to an internal processing exception of the codec, the working thread requests the controller to mark the working thread as a zombie thread.
  • the method further includes: in response to an exception occurring after the codec processes the first video stream data, the working thread sending a packet loss event to the controller, and the controller sending a packet loss reminder to the first client.
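The three failure paths described above (rollback before the codec, zombie marking inside it, packet loss after it) can be sketched as a single dispatcher; the event names mirror the text, while the function shape is invented.

```python
def handle_exception(stage: str, chunk: bytes, input_buf: list,
                     events: list) -> None:
    if stage == "before_codec":
        input_buf.insert(0, chunk)   # rollback: return data to the input buffer
        events.append("rollback")
    elif stage == "in_codec":
        events.append("mark_zombie") # controller marks the thread as a zombie
    elif stage == "after_codec":
        events.append("packet_loss") # client receives a packet-loss reminder

buf, ev = [], []
handle_exception("before_codec", b"f1", buf, ev)
handle_exception("after_codec", b"f2", buf, ev)
assert buf == [b"f1"] and ev == ["rollback", "packet_loss"]
```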
  • At least one embodiment of the present disclosure provides an information processing method, comprising: sending an initialization request to a server, wherein the initialization request includes playback parameters of video stream data supported for playback; after the server responds to the initialization request, sending a task instruction to the server; and receiving second video stream data provided by the server, and playing the second video stream data, wherein the second video stream data is obtained by the server converting first video stream data obtained according to the task instruction.
  • receiving the second video stream data provided by the server and playing the second video stream data includes: receiving the second video stream data provided by the server; parsing the second video stream data to obtain the basic playback information in the second video stream data; and playing the second video stream data according to the basic playback information.
  • At least one embodiment of the present disclosure provides an information processing apparatus, including: a first receiving unit configured to receive a first initial application from a first client and respond to the first initial application, wherein the first initial application includes playback parameters of video stream data that the first client supports playing; a second receiving unit configured to receive a task instruction from the first client after responding to the first initial application; an instruction acquisition unit configured to obtain first video stream data according to the task instruction; a conversion unit configured to convert the first video stream data into second video stream data, so that the second video stream data has the playback parameters; and a providing unit configured to provide the second video stream data to the first client so that the second video stream data is played on the first client.
  • At least one embodiment of the present disclosure provides an information processing device, including: an application sending unit configured to send an initialization application to a server, wherein the initialization application includes playback parameters of video stream data supported for playback; an instruction sending unit configured to send a task instruction to the server after the server responds to the initialization application; and a playback unit configured to receive the second video stream data provided by the server and play the second video stream data, wherein the second video stream data is obtained by the server converting the first video stream data obtained according to the task instruction.
  • At least one embodiment of the present disclosure provides an electronic device, including a processor and a memory containing one or more computer program instructions, wherein the one or more computer program instructions are stored in the memory and, when executed by the processor, implement the information processing method provided by any embodiment of the present disclosure.
  • At least one embodiment of the present disclosure provides a computer-readable storage medium that non-transitorily stores computer-readable instructions, wherein, when the computer-readable instructions are executed by a processor, the information processing method provided by any embodiment of the present disclosure is implemented.
  • Figure 1 shows a system architecture 100 to which an information processing method provided by at least one embodiment of the present disclosure is applied;
  • Figure 2 shows a flow chart of an information processing method provided by at least one embodiment of the present disclosure;
  • Figure 3A shows a method flow chart of step S50 in Figure 2 provided by at least one embodiment of the present disclosure;
  • Figure 3B shows an example diagram of multiple coding layers provided by at least one embodiment of the present disclosure;
  • Figure 4 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure;
  • Figure 5 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure;
  • Figure 6 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure;
  • Figure 7 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure;
  • Figure 8 is a schematic diagram of a system architecture 800 to which an information processing method provided by at least one embodiment of the present disclosure is applied;
  • Figure 9 shows a schematic diagram of a server architecture 900 to which an information processing method provided by at least one embodiment of the present disclosure is applied;
  • Figure 10 shows a schematic block diagram of an information processing device 1000 provided by at least one embodiment of the present disclosure;
  • Figure 11 shows a schematic block diagram of another information processing device provided by at least one embodiment of the present disclosure;
  • Figure 12 shows a schematic block diagram of an electronic device provided by at least one embodiment of the present disclosure;
  • Figure 13 shows a schematic block diagram of another electronic device provided by at least one embodiment of the present disclosure; and
  • Figure 14 shows a schematic diagram of a computer-readable storage medium provided by at least one embodiment of the present disclosure.
  • For video, H264 is the most widely used, H265 has a higher compression ratio, AV1 offers good cost performance, and MPEG4 also has a relatively high compression ratio.
  • For audio, PCM offers better fidelity, WAV is easy to use, and MP3 is the most common format.
  • Each coding and decoding algorithm has its own unique characteristics and is used in specific scenarios.
  • encapsulation formats have evolved into a variety of formats, such as avi, mov, mkv, flv, and the most popular, MP4.
  • the administrator wants to view the H264 video stream captured by a camera connected to the network on a web page.
  • all major browser vendors prohibit the transmission of Real Time Streaming Protocol (RTSP) data packets on web pages. Therefore, because the browser cannot use the Real Time Streaming Protocol, the administrator cannot view the H264 video stream through the web page.
  • the video stream from the camera is an H265 video stream
  • the administrator's terminal does not support playback of H265 video (for example, the decoder in the terminal cannot decode H265 video), resulting in the administrator being unable to view the H265 video stream on the terminal.
  • due to requirements for lightness and compactness, wearable devices are designed with single-chip microcomputers instead of CPUs and GPUs. As a result, the decoding computing power of wearable devices is weak and cannot decode the camera video, making it difficult to use a wearable device to quickly view the camera video in a given direction.
  • for a terminal in a poor network environment (that is, a weak network environment), packet loss is likely, resulting in poor quality of the audio and video played by the terminal and affecting the viewing experience.
  • the present disclosure provides another information processing method to solve the problem of difficulty in playing video in scenarios such as the browser being unable to use the protocol, the terminal lacking a corresponding decoder, weak decoding computing power, and a weak network environment in the above example.
  • At least one embodiment of the present disclosure provides an information processing method, another information processing method, an information processing apparatus, another information processing apparatus, an electronic device, and a computer-readable storage medium.
  • the information processing method includes: receiving a first initial application from a first client and responding to the first initial application, wherein the first initial application includes playback parameters of video stream data that the first client supports playing; after responding to the first initial application, receiving a task instruction from the first client; obtaining first video stream data according to the task instruction; converting the first video stream data into second video stream data, so that the second video stream data has the playback parameters; and providing the second video stream data to the first client so that the second video stream data is played on the first client.
  • This information processing method can solve the problem of difficulty in playing audio and video in complex encoding and decoding environments.
  • FIG. 1 shows a system architecture 100 applied to an information processing method provided by at least one embodiment of the present disclosure.
  • the system architecture 100 may include a terminal device 101 , a server 102 and a communication network 103 .
  • the user can use the terminal device 101 to interact with the server 102 through the communication network 103 to receive or send messages.
  • the communication network 103 is a medium used to provide a communication link between the terminal device 101 and the server 102 .
  • the communication network 103 may include various connection types, such as wired or wireless communication links, for example Wi-Fi, 3G, 4G, 5G, fiber optic cables, etc.
  • the terminal device 101 can be various electronic devices with audio and/or image playback functions, including but not limited to smartphones, tablets, laptops, etc.
  • the terminal device 101 can also be a single-chip microcomputer, an SoC, a browser, a custom player, etc.
  • the embodiment of the present disclosure does not limit the product type of the terminal device 101; for example, the terminal device can be based on various available operating systems, such as Windows, Android, iOS, etc.
  • Various application programs (APPs) can be installed in the terminal device 101, such as audio and video playback applications, shopping applications, web browser applications, instant messaging tools, etc., or mini programs, quick apps, etc. can be downloaded and run through application platforms (such as WeChat, Alipay, etc.).
  • the user can use the audio and video playing application in the terminal device 101 to play music or videos.
  • the server 102 may be a server that performs the information processing method shown in FIG. 2 below.
  • the server 102 may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server.
  • the server 102 can be built on a server with an operating system such as CentOS, Debian, FreeBSD, etc., which have better support for codecs such as ffmpeg. If the server 102 involves a dedicated chip, a separate software toolkit needs to be developed to replace ffmpeg.
  • FIG. 2 shows a flow chart of an information processing method provided by at least one embodiment of the present disclosure.
  • the method may include steps S10 to S50.
  • Step S10 Receive a first initial application from the first client and respond to the first initial application.
  • the first initial application includes playback parameters of video stream data supported by the first client.
  • Step S20 After responding to the first initial application, receive a task instruction from the first client.
  • Step S30 According to the task instruction, obtain the first video stream data.
  • Step S40 Convert the first video stream data into the second video stream data, so that the second video stream data has playback parameters.
  • Step S50 Provide the second video stream data to the first client so that the second video stream data is played on the first client.
  • the information processing method shown in FIG. 2 may be executed by the server 102 in FIG. 1 .
  • the server 102 encodes and decodes the audio and video to be played by the first client, thereby utilizing the computing power provided by the powerful software/hardware of the cloud (i.e., the server) to meet the audio and video playback needs of most terminals, achieving a cloud player that solves the problem of difficult audio and video playback in complex codec environments.
  • the first client may be, for example, an audio and video playback application installed on the terminal device 101 in Figure 1.
  • the first client sends a first initial application to the server 102, and the first initial application includes playback parameters of video stream data that the first client supports playing.
  • the playback parameters are, for example, specific parameter requirements for the first client (for example, the first client's browser playback toolkit) to support playback (high-definition or real-time).
  • Video stream data is, for example, a data stream that can be read, recognized and played by an audio and video player.
  • the first initial application also includes an instruction type.
  • the playback parameters may be encoding parameters.
  • Encoding parameters include, for example, encapsulation format, codec format, encoding level, etc.
  • the playback parameters may include the playback rate, the playback position to which the audio and video jumps, etc.
  • Instruction types can also include encoding and decoding instructions.
  • Playback parameters may include, for example, ffmpeg's AVOption parameters, which are used to configure decoding strategies and encoding rules.
  • The MP4 encapsulation format usually consists of three boxes: an ftyp box that records the file type, a moov box that records basic playback information such as frame rate, and an mdat box that stores the actual media data.
  • fMP4 is MPEG4's support for MP4 live streaming capabilities. fMP4 is similar to MP4 but does not require a moov box; instead, it puts the basic information into moof boxes one by one, so that a pattern such as ftyp + moof + mdat + moof + mdat forms a type of streaming media data.
  • the playback parameters include the playback time parameter frag_duration, which is used to represent the playback duration of one moof + mdat group.
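The box layouts described above can be illustrated with a toy writer and reader that model only the common 4-byte size + 4-byte type framing of MP4 boxes; real ftyp/moov/moof/mdat boxes carry far more structure than this sketch.

```python
import struct

def box(kind: bytes, payload: bytes) -> bytes:
    # every MP4 box starts with a 4-byte big-endian size (including the
    # 8-byte header) followed by a 4-byte type code
    return struct.pack(">I", 8 + len(payload)) + kind + payload

def box_types(data: bytes) -> list:
    kinds, i = [], 0
    while i < len(data):
        (size,) = struct.unpack(">I", data[i:i + 4])
        kinds.append(data[i + 4:i + 8].decode())
        i += size
    return kinds

# an fMP4-style stream: ftyp followed by repeating moof + mdat pairs
stream = box(b"ftyp", b"iso5") + box(b"moof", b"m1") + box(b"mdat", b"d1") \
       + box(b"moof", b"m2") + box(b"mdat", b"d2")
assert box_types(stream) == ["ftyp", "moof", "mdat", "moof", "mdat"]
```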
  • Playback parameters include encoding level and image quality level. Playback parameters may also include encoding format, etc.
  • the following table 1 discloses a client instruction set provided by at least one embodiment.
  • the first initialization application and task instructions can be generated according to the format of the client instruction set shown in Table 1.
  • if the first byte of an instruction is 0, the instruction is a data instruction; if the second byte of the instruction is 0, the instruction is an initialization application instruction.
  • the initialization application instruction includes a json-format string, and the json-format string includes playback parameters; for example, the encapsulation format is fmp4, the codec format is h264, the encoding level is 42, etc. If the first byte of an instruction is 1, the instruction is a player instruction; if the second byte of the player instruction is 1, the instruction is a start playback instruction.
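The byte-level instruction framing described above can be sketched as a small parser; only the combinations the text names are decoded, and the JSON body shape is taken from the example playback parameters (everything else is an assumption).

```python
import json

def parse_instruction(packet: bytes):
    kind, sub = packet[0], packet[1]
    if kind == 0 and sub == 0:
        # initialization application: body is a json playback-parameter string
        return ("init", json.loads(packet[2:].decode()))
    if kind == 1 and sub == 1:
        return ("start_playback", None)
    return ("unknown", None)  # other combinations are not modeled here

body = json.dumps({"container": "fmp4", "codec": "h264", "level": 42})
tag, params = parse_instruction(bytes([0, 0]) + body.encode())
assert tag == "init" and params["codec"] == "h264"
assert parse_instruction(bytes([1, 1]))[0] == "start_playback"
```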
  • the server 102 in FIG. 1 receives the first initial application sent by the terminal device 101 and responds to the first initial application.
  • step S10 may include: receiving the first initial application from the first client; and, in response to the first initial application, creating a working thread and opening an input buffer area and an output buffer area for the working thread.
  • the input buffer area is configured to store the first video stream data.
  • the working thread is configured to obtain the first video stream data from the input buffer area and process the first video stream data to obtain the second video stream data.
  • the output buffer is configured to receive the second video stream data provided by the working thread, and provide the second video stream data to the first client.
  • a worker thread is started, and two pieces of heap memory are opened and shared with the worker thread.
  • One of the two heap memories is used as the input buffer area, and the other is used as the output buffer area.
  • responding to the initial application may also include the working thread initializing the codec (for example, ffmpeg) working script according to the initialization data, creating the output and relay processing units, and then waiting to read data from the heap memory; subsequently, the working thread continuously tries to read data from the heap memory, and the initialization process is completed.
  • the server includes a controller, and the controller starts a working thread and opens two pieces of heap memory to share with the working thread.
  • For step S20, for example, after receiving information from the server that the initialization application is completed, the first client sends a task instruction to the server.
  • the task instruction may be, for example, an instruction requesting to play a video.
  • the task instruction may include the acquisition method of the first video data stream, the byte length of the first video data stream, etc.
  • the server 102 receives task instructions from the terminal device.
  • the first video stream data is, for example, a video or picture captured by a camera in the park.
  • the large-screen terminal in the park monitoring room serves as the first client to request to play the video stream captured by the camera.
  • the first video stream data may also be a video stream from a third party, such as a user's live broadcast.
  • For step S30, for example, a manner of obtaining the first video stream data is determined according to the task instruction, and the first video stream data is obtained in that manner.
  • determining an acquisition manner for obtaining the first video stream data according to the task instruction, and obtaining the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the uniform resource locator manner, obtaining the uniform resource locator from the task instruction; obtaining the first video stream data according to the uniform resource locator; and storing the first video stream data in the input buffer area.
  • This embodiment directly proxies the uniform resource locator (url) to the codec software development toolkit (for example, ffmpeg sdk).
  • the server accesses the URL to obtain the first video stream data.
  • the method of using the URL to obtain the first video stream data is simple, direct, and easy to implement.
  • determining an acquisition manner for obtaining the first video stream data according to the task instruction, and obtaining the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the buffer manner, extracting the first video stream data from the task instruction; and storing the first video stream data in the input buffer area.
  • This embodiment has a wide scope of application and high flexibility, and is suitable for situations where the cloud player (ie, server) cannot directly access the multimedia source.
  • the first client encapsulates the audio and video data and sends it to the cloud.
  • The first client encapsulates the audio and video data by itself to obtain a buffer byte stream, and sends the buffer byte stream to the cloud player.
  • The audio and video data can be encapsulated directly in the task instruction, so that the cloud player directly reads the buffer byte stream in the task instruction to obtain the first video stream data, and stores the first video stream data in the input buffer area.
  • CUSTOM_IO is ffmpeg's user-defined input/output handler, used to implement a custom read procedure.
  • Determining an acquisition method for obtaining the first video stream data according to the task instruction, and obtaining the first video stream data accordingly, includes: in response to the task instruction indicating that the acquisition method is the fragmented byte stream method, sequentially receiving multiple task sub-instructions provided by the first client, the multiple task sub-instructions respectively including different parts of the first video stream data; sequentially extracting the parts of the first video stream data from the multiple task sub-instructions; and sequentially storing those parts in the input buffer area.
  • The first client fragments the first video stream data, that is, splits it across multiple task sub-instructions, and continuously provides the task sub-instructions to the cloud player.
  • Each task sub-instruction includes a different fragment, that is, a different part, of the fragmented first video stream data.
  • the cloud player obtains the fragments of the first video stream data from each task sub-instruction and adds the fragments to the input buffer area.
  • This embodiment requires the reserved memory space size to be provided during the first initialization; otherwise, the reserved memory space is created with a default size (for example, 1 GB). After that, the first client continuously writes data into this memory.
  • The data can carry a serial number.
  • the transmission of fragmented video stream data has better real-time performance and is more suitable for live broadcasts.
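  • As an illustrative sketch only (the class and the (serial_number, payload) layout are hypothetical, not the disclosed wire format), the in-order reassembly of serial-numbered fragments described above can be modeled as follows:

```python
# Sketch of reassembling fragmented first-video-stream data from task
# sub-instructions. Names and the (serial_number, payload) layout are
# illustrative assumptions, not the patent's exact format.
import heapq

class FragmentAssembler:
    """Collects (serial_number, bytes) fragments and releases them in order."""

    def __init__(self):
        self._pending = []          # min-heap keyed by serial number
        self._next_serial = 0
        self.input_buffer = bytearray()

    def add_sub_instruction(self, serial_number: int, payload: bytes) -> None:
        heapq.heappush(self._pending, (serial_number, payload))
        # Drain every fragment that is now contiguous with what we have.
        while self._pending and self._pending[0][0] == self._next_serial:
            _, part = heapq.heappop(self._pending)
            self.input_buffer.extend(part)   # store in the input buffer area
            self._next_serial += 1

asm = FragmentAssembler()
asm.add_sub_instruction(1, b"WORLD")   # arrives out of order
asm.add_sub_instruction(0, b"HELLO ")
```

Because fragments may arrive out of order under high concurrency, buffering by serial number before appending keeps the input buffer area byte-accurate.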
  • If the secondary byte of the data instruction is 1, the first video stream data is obtained by pulling it directly from the URL.
  • If the secondary byte of the data instruction is 2, the client provides a byte stream, that is, the buffer mode.
  • If the client provides partial byte streams in sequence, this is the fragmented byte stream mode.
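  • The mode selection above can be sketched as a dispatch on the secondary byte. This is an illustrative assumption about the instruction layout (including the assumption that the fragmented mode uses the value 3), not the claimed encoding:

```python
# Illustrative dispatch on the acquisition-method byte of a task
# instruction. Byte offsets and the value 3 for fragmented mode are
# assumptions for this sketch only.
URL_MODE, BUFFER_MODE, FRAGMENT_MODE = 1, 2, 3

def acquisition_mode(instruction: bytes) -> str:
    """Map the mode byte (assumed to be the second byte) to a method name."""
    mode = instruction[1]
    if mode == URL_MODE:
        # Remainder of the instruction carries the UTF-8 uniform resource locator.
        return "url:" + instruction[2:].decode("utf-8")
    if mode == BUFFER_MODE:
        return "buffer"           # payload carries the byte stream itself
    if mode == FRAGMENT_MODE:
        return "fragmented"       # payload arrives as serial-numbered parts
    raise ValueError(f"unknown acquisition mode {mode}")
```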
  • Step S40 may include: the working thread reading the first video stream data from the input buffer and storing it in the working thread's processing queue; and, according to the playback parameters, the working thread decoding and encoding the first video stream data in the processing queue to obtain the second video stream data.
  • This processing queue stores data to be processed until the worker thread releases the read memory lock and emits the read-completion event.
  • The working thread continuously reads the data with the smallest sequence number from the input cache, and erases that data from the input cache after reading to facilitate subsequent writes to the input cache.
  • If the input buffer is full, the worker thread is suspended until the input buffer drops below a certain threshold (the threshold can be configured at initialization).
  • Similarly, if the output buffer is full, the thread is suspended until the output buffer drops below a certain threshold (likewise configurable at initialization).
  • the controller will continue to send status data to the first client.
  • the status data may include, for example, the status of the worker thread (eg, read lock locked, read lock released, write lock locked, write lock released, system exception, etc.).
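  • The suspend/resume behaviour around a full buffer can be sketched with an explicit capacity and resume threshold. All names and sizes here are illustrative, not the disclosed implementation:

```python
# Minimal backpressure model for one buffer: a lock is triggered when the
# buffer is full, and released only once it drains below a configurable
# threshold. Capacities are illustrative.
class BufferGate:
    def __init__(self, capacity: int, resume_threshold: int):
        self.capacity = capacity
        self.resume_threshold = resume_threshold
        self.size = 0
        self.locked = False       # True == work suspended for this buffer

    def on_write(self, nbytes: int) -> None:
        self.size += nbytes
        if self.size >= self.capacity:
            self.locked = True    # trigger the lock event

    def on_drain(self, nbytes: int) -> None:
        self.size = max(0, self.size - nbytes)
        if self.locked and self.size <= self.resume_threshold:
            self.locked = False   # release the lock, work resumes

gate = BufferGate(capacity=100, resume_threshold=40)
gate.on_write(100)          # buffer full -> lock triggered
locked_when_full = gate.locked
gate.on_drain(30)           # 70 bytes left, still above the threshold
still_locked = gate.locked
gate.on_drain(30)           # 40 bytes left -> lock released
```

The same gate shape applies to both the read lock (input buffer) and the write lock (output buffer) described above.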
  • The working thread searches ffmpeg for a suitable encoder according to the encoding and decoding parameters in the initialization application.
  • any commonly used encoder will find the corresponding hardware-accelerated version of the encoder registered in the database.
  • the hardware-accelerated encoder corresponding to the h264 encoder is nvenc_h264 encoder; the hardware-accelerated encoder corresponding to the h265 encoder is hevc_nvenc encoder.
  • the server can store a correspondence table between encoders and hardware-accelerated encoders, so as to find the hardware-accelerated encoder according to the correspondence table, so as to use the hardware-accelerated encoder to improve encoding and decoding speed.
  • the user can input the required target encoder on the client, for example, the target encoder is h264 encoder, etc.
  • Based on the target encoder provided by the client, the server searches the correspondence table to determine whether a corresponding hardware-accelerated encoder exists. If one exists, the hardware-accelerated encoder is used to encode and decode the first video stream data, improving response speed.
  • Alternatively, a default encoder is provided to the server, so that the server looks up the correspondence table to determine whether a hardware-accelerated encoder corresponding to the default encoder exists.
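  • The correspondence-table lookup can be sketched as a plain dictionary; the mappings follow the h264/h265 examples given above, and the fallback behaviour is an assumption of the sketch:

```python
# Correspondence table between requested encoders and their
# hardware-accelerated counterparts, per the examples in the text.
HW_ENCODER_TABLE = {
    "h264": "nvenc_h264",
    "h265": "hevc_nvenc",
}

def select_encoder(target: str) -> str:
    """Prefer the hardware-accelerated encoder when one is registered;
    otherwise fall back to the requested encoder itself (an assumption)."""
    return HW_ENCODER_TABLE.get(target, target)
```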
  • The working thread decoding and encoding the first video stream data in the processing queue to obtain the second video stream data includes: the working thread loading the first video stream data in the processing queue into the codec; and the codec decoding and encoding the first video stream data to obtain the second video stream data.
  • The codec may be, for example, the ffmpeg codec.
  • the server includes the ffmpeg toolkit, which enables multiple codec functions, such as hardware acceleration, h264 codec, and h265 codec.
  • ffmpeg can also include instruction set acceleration and multi-threading lib packages.
  • the worker thread can utilize the ffmpeg toolkit to decode and encode the first video stream data in the processing queue.
  • The information processing method may further include: in response to an exception occurring in the first video stream data before it enters the codec, the working thread sending a rollback event to the controller; and the controller, in response to the rollback event, returning the first video stream data to the input buffer area.
  • The worker thread throws a rollback event, the controller returns the data in the processing queue to the input buffer, and the worker thread rolls back the read cursor to the previous position so that the state is reset.
  • There is an internal frame sequence inside ffmpeg, which makes consumed data irreversible. Therefore, only data that has not yet entered the ffmpeg reader can be rolled back; data that has come out of the decoder is encoded (or, if the user has not configured encoding, is issued as raw YUV frames / PCM samples) and can only enter the output buffer area. For this reason, when the working thread reads data from the input buffer, cursor bits are kept for data identification.
  • The first video stream data can be returned to the input buffer area, so that the correct first video stream data is read from the input buffer area again, ensuring the correctness of the first video stream data and improving codec accuracy.
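  • The cursor-based rollback can be sketched as follows; the class and method names are illustrative, assuming the input buffer retains data until the worker confirms it:

```python
# Sketch of cursor-based rollback: the worker reads through a cursor and,
# on an exception raised before the data enters the codec, the controller
# rolls the cursor back so the same bytes are read again. Illustrative only.
class CursoredInputBuffer:
    def __init__(self, data: bytes):
        self.data = data
        self.cursor = 0
        self.prev_cursor = 0

    def read(self, n: int) -> bytes:
        self.prev_cursor = self.cursor
        chunk = self.data[self.cursor:self.cursor + n]
        self.cursor += len(chunk)
        return chunk

    def rollback(self) -> None:
        """Controller returns the data by moving the cursor back."""
        self.cursor = self.prev_cursor

buf = CursoredInputBuffer(b"frame1frame2")
first = buf.read(6)         # b"frame1"
buf.rollback()              # exception before the codec: roll back
again = buf.read(6)         # the same b"frame1" is read correctly
```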
  • the information processing method may further include: in response to an internal processing exception of the codec, the working thread requests the controller to mark the working thread as a zombie thread.
  • When the controller marks a thread as a zombie thread, it kills the thread and clears all of the thread's related data in the input buffer area, processing queue, and output buffer area. After notifying the terminal of the destruction event, it clears the key-value pair, disconnects, and destroys the socket connection handle.
  • the information processing method may further include: in response to an exception occurring after the codec processes the first video stream data, the working thread sends a packet loss event to the controller, and the controller sends a packet loss reminder to the first client.
  • the worker thread will throw a packet loss event, and the controller will be asked to notify the terminal of the event.
  • This embodiment can promptly notify the first client of packet loss, and facilitates the first client processing the packet loss event in a timely manner to ensure the accessibility and correctness of the first video stream data.
  • FIG. 3A shows a method flowchart of S50 in FIG. 2 provided by at least one embodiment of the present disclosure.
  • step S50 may include steps S51 to S54.
  • Step S51 Generate encoding information according to the encoding operation performed by the working thread on the first video stream data.
  • Step S52 Store the coding information at the head of the output buffer area, so that the first data packet output by the output buffer area contains the coding information.
  • Step S53 Write the second video stream data into the output buffer area.
  • Step S54 Provide multiple data packets from the output buffer area to the first client in sequence, where the multiple data packets include encoding information and second video stream data.
  • In step S51, after decoding and demultiplexing (demuxing) the first video stream data, the working thread obtains encoding information, such as the encoding profile, maximum resolution level, sequence parameter set (SPS), picture parameter set (PPS), and constraint_set_flag (SPS/PPS information).
  • the encoding information is stored at the head of the output buffer area, and the encoding information cannot be erased before the working thread is destroyed. In this way, any client requesting access to the second video stream data will obtain the encoding information, so that initial preparation or optimization of decoding can be performed locally on the client. Therefore, in this embodiment, the first data packet of the message returned by the cloud player to the first client or other clients includes coding information.
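  • Pinning the coding information at the head of the output buffer, so that every client that starts pulling receives it first, can be sketched as follows (names are illustrative):

```python
# Sketch: the output buffer pins a codec packet (SPS/PPS etc.) at its head;
# it is never erased while the worker thread lives, so any client that
# starts pulling receives it as the first data packet. Illustrative only.
class OutputBuffer:
    def __init__(self):
        self.codec_packet = None    # pinned head, survives normal draining
        self.packets = []

    def set_coding_info(self, info: bytes) -> None:
        self.codec_packet = info

    def write(self, packet: bytes) -> None:
        self.packets.append(packet)

    def stream_for_new_client(self):
        """First packet out is always the codec packet."""
        yield self.codec_packet
        yield from self.packets

out = OutputBuffer()
out.set_coding_info(b"sps+pps")
out.write(b"frame-a")
out.write(b"frame-b")
delivered = list(out.stream_for_new_client())
```

This is why the first data packet returned to the first client (or any later joiner) always contains the encoding information, letting the client prepare or optimize its decoder locally.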
  • the packet containing the coding information is called a codec packet.
  • the working thread continuously encodes and decodes the fragmented first video stream data, and continuously writes the encoded and decoded second video stream data into the output buffer.
  • step S53 may include: encapsulating basic playback information of key frames in the second video stream data into an obfuscation package so that the first client parses the obfuscation package to obtain the basic playback information, wherein the obfuscation package includes multiple randomly generated bytes.
  • Encapsulating the basic playback information of key frames in the second video stream data into obfuscated packets can improve the security of downlink data, that is, any third party cannot decode the video stream without requesting it from the cloud.
  • the obfuscation packet of this disclosure adopts the hybrid mode, that is, if it is a packet containing a key frame, the basic playback information is encapsulated into an obfuscation packet (wild card, WC), otherwise it is sent normally.
  • the WC packet contains necessary data information and requires the cooperation of the client to generate I frames. Due to the special nature of key frames, other reference frames will be distorted if they are missing key frames.
  • Encapsulating the basic playback information of the key frames in the second video stream data into the obfuscation package can exist in the form of a pluggable plug-in, that is, it is configurable.
  • The wild card box is a kind of obfuscation package; it differs from other MP4 boxes.
  • The first 4 bytes are the string identification bits "wc", and the next 8 bytes are the token.
  • The token here is the 8-byte unsigned int token obtained when the socket was established.
  • The following 8 bytes are a random-length value, followed by that number of randomly generated bytes used for obfuscation. What follows is the real moof package's specific sub-box, mfhd, etc.; however, the client needs to construct container boxes such as moof and track fragment (traf) by itself.
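  • Following the layout just described (4-byte "wc" identifier, 8-byte token, 8-byte random-filler length, the filler, then the real payload), a WC packet can be built and unwrapped as below. The little-endian byte order and the zero-padding of the 2-character "wc" identifier to 4 bytes are assumptions of this sketch:

```python
# Build and parse a wild-card (WC) obfuscation packet with the layout
# described above. Endianness and identifier padding are assumptions.
import os, struct

def build_wc_packet(token: int, payload: bytes, filler_len: int = 16) -> bytes:
    filler = os.urandom(filler_len)       # randomly generated obfuscation bytes
    return (b"wc\x00\x00"                 # 4-byte string identification bits
            + struct.pack("<Q", token)    # 8-byte unsigned int token
            + struct.pack("<Q", filler_len)
            + filler
            + payload)                    # real moof sub-box data (mfhd etc.)

def parse_wc_packet(packet: bytes):
    assert packet[:2] == b"wc", "not a wild-card packet"
    token = struct.unpack("<Q", packet[4:12])[0]
    filler_len = struct.unpack("<Q", packet[12:20])[0]
    payload = packet[20 + filler_len:]    # skip the random obfuscation bytes
    return token, payload

pkt = build_wc_packet(token=12345, payload=b"mfhd-data")
tok, body = parse_wc_packet(pkt)
```

A third party that does not know the token and the layout cannot locate the real payload inside the random filler, which is the security benefit claimed for the hybrid mode.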
  • multiple data packets are sequentially provided to the first client from the output buffer area, and the multiple data packets include a codec packet and multiple data packets corresponding to the second video stream data.
  • The working thread includes a read lock.
  • The read lock is triggered when the input buffer is full; the working thread then no longer reads in the first video stream data.
  • The working thread also includes a write lock.
  • The write lock is triggered when the output buffer is full; the working thread then no longer writes the second video stream data to the output buffer, and no longer reads in the first video stream data.
  • the read lock is triggered when the input buffer is full and the first video stream data is no longer read.
  • the client will receive the read lock information provided by the server.
  • The write lock is triggered when the output buffer is full; the codec in the worker thread is suspended, no longer writes out, and the first video stream data is no longer read into the input buffer.
  • The information processing method also includes: obtaining the reading speed Vr at which the input buffer reads in the first video stream data; obtaining the transcoding speed Vd at which the working thread converts the first video stream data into the second video stream data; obtaining the writing speed Vw at which the output memory writes out the second video stream data; determining whether the transcoding speed Vd is greater than the writing speed Vw and whether the writing speed Vw is greater than the reading speed Vr; and, in response to the speeds not satisfying Vd > Vw > Vr, adjusting the reading speed Vr, the transcoding speed Vd, and the writing speed Vw.
  • IO Read (Websocket) KBPS represents the websocket reading speed; Input Stream BitRate represents the bit-rate information in the read-in stream header; IO Write (Websocket) KBPS represents the websocket writing speed; Output Stream BitRate represents the bit-rate information in the written-out stream header; Decoder (Worker) Read KBPS represents the decoding speed.
  • Table 2 below shows the correspondence between the reading speed Vr, transcoding speed Vd, writing speed Vw, and lock events provided by at least one embodiment of the present disclosure.
  • When reading, writing, and decoding, the worker thread counts the bytes flowing in and out. Every few hundred milliseconds, it determines the opening/release of the read-write locks according to the table and restarts the statistics.
  • The expected order of the reading, transcoding, and writing speeds is Vd > Vw > Vr, which ensures that the video stream arriving at the cloud player is encoded, decoded, and sent at the fastest speed. Otherwise, data backs up at one end, causing locking. If a read lock occurs, the writing process or decoding process is relatively slow; if the main cause is the decoder, the problem can be alleviated or solved by dynamically modifying the encoding/decoding rate (that is, the code rate) of the worker thread.
  • In some embodiments, the transcoding speed is divided into multiple encoding layers with sequentially decreasing transcoding code rates. In response to the speeds not satisfying Vd > Vw > Vr, adjusting them includes: in response to the transcoding speed being less than the writing speed, starting from the next key frame of the first video stream data, adjusting the transcoding code rate to that of the encoding layer below the current one.
  • The transcoding code rate can then be lowered by a further coding layer; that is, in this embodiment, the code rate is reduced step by step, increasing the transcoding speed and making playback smoother.
  • the exception event of the read lock is returned and packets of new first video stream (read-in stream) data are no longer accepted.
  • Upon receiving the read-lock exception event, the first client must cache the data packets to be sent locally or discard them.
  • Dynamic modification of the working thread's code rate can be enabled via initialization parameter configuration.
  • For example, during initialization, the current read-stream code rate is used as the origin, and N coding layers are prepared downwards.
  • The code rate of layer N is the original code rate × 4⁻ᴺ.
  • Each read-lock event increments a read-lock counter (RC). When RC exceeds a certain threshold, the code rate is adjusted down one layer starting from the next key frame and RC is reset, which means switching to the next layer and regenerating the output stream.
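  • The downward bitrate ladder (layer N at original × 4⁻ᴺ) and the read-lock counter can be sketched as follows; the RC threshold value is illustrative:

```python
# Sketch of the downward bitrate ladder: layer N runs at
# original_rate * 4**-N, and a read-lock counter (RC) above a threshold
# switches to the next layer at the next key frame. Threshold is illustrative.
def coding_layers(original_kbps: int, n_layers: int):
    return [original_kbps * 4 ** -n for n in range(n_layers)]

class RateController:
    RC_THRESHOLD = 3              # illustrative read-lock-count threshold

    def __init__(self, original_kbps: int, n_layers: int = 4):
        self.layers = coding_layers(original_kbps, n_layers)
        self.layer = 0
        self.rc = 0

    def on_read_lock(self) -> None:
        self.rc += 1
        if self.rc > self.RC_THRESHOLD and self.layer + 1 < len(self.layers):
            self.layer += 1       # drop one layer from the next key frame
            self.rc = 0           # reset RC, regenerating the output stream

    @property
    def bitrate(self) -> float:
        return self.layers[self.layer]

ctl = RateController(1024)
for _ in range(4):                # four read locks exceed the threshold
    ctl.on_read_lock()
```

Starting from 1024 kbps this yields exactly the 1024/256/64/16 kbps ladder shown in Fig. 3B.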
  • FIG. 3B shows an example diagram of multiple coding layers provided by at least one embodiment of the present disclosure.
  • For example, four encoding layers can be prepared, namely encoding layers with code rates of 1024 kbps, 256 kbps, 64 kbps, and 16 kbps.
  • the code rate at which the working thread encodes the second video stream information into YUV420 format can be selected from 1024kbps, 256kbps, 64kbps and 16kbps.
  • the code rate can be adjusted from 1024kbps to 256kbps.
  • Figure 4 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure.
  • the information processing method may further include step S60 and step S70 in addition to steps S10 to S50 shown in Fig. 2.
  • Step S60 and step S70 may be performed after step S50, for example.
  • Step S60 Obtain the player instruction of the first client.
  • Step S70 Control the playback state of the second video stream data on the first client according to the player instruction.
  • This method allows the user to control the played second video stream data through the first client, such as controlling the playback speed, pausing playback, starting playback, etc. of the second video stream data, thereby further improving the functions of the cloud player.
  • the player instructions include at least one of the following: a pause play instruction, a start play instruction, a double-speed play instruction, a reset to initial state instruction, and a jump instruction.
  • the pause playback command is used to control the second video stream data to pause playback
  • the start playback command is used to control the second video stream data to start playing
  • the double speed playback command is used to adjust the playback speed of the second video stream data
  • The reset-to-initial-state command is used to reset the playback of the second video stream data to its initial state.
  • the jump instruction is used to control the video playback position to which the second video stream data jumps, so as to start playing from the video playback position.
  • a player instruction includes multiple bytes, and a first byte of 1 indicates that the instruction is a player instruction.
  • the second byte indicates the specific play instruction.
  • a subbyte of 0 indicates that the player instruction is to reset to the initial state
  • a subbyte of 1 indicates that the player instruction is to start playing
  • a subbyte of 2 indicates that the player instruction is to pause, etc.
  • Those skilled in the art can set the value of the sub-byte to represent the player command.
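  • Per the byte scheme above (first byte 1 flags a player instruction; secondary byte 0/1/2 selects reset/start/pause), decoding can be sketched as follows; the function name and the handling of unknown values are assumptions:

```python
# Decode a player instruction per the byte scheme described above:
# first byte 1 flags a player instruction, second byte picks the action.
PLAYER_ACTIONS = {0: "reset", 1: "start", 2: "pause"}

def decode_player_instruction(data: bytes):
    if data[0] != 1:
        return None               # not a player instruction
    return PLAYER_ACTIONS.get(data[1], "unknown")
```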
  • In step S70, the server responds to the player instruction to control the playback speed, pausing, starting, etc. of the second video stream data on the first client.
  • Figure 5 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure.
  • the information processing method may also include step S80 and step S90 .
  • Step S80 may be performed before step S10, for example.
  • Step S80 Before receiving the first initial application from the first client, establish a two-way data channel with the first client.
  • Step S90 After responding to the first initial application, establish a corresponding relationship between the bidirectional data channel, the output buffer area, the input buffer area, the verification information and the working thread, so as to interact with the first client according to the corresponding relationship.
  • the bidirectional data channel may be, for example, a Websocket-based data channel.
  • the first client and server communicate bi-directionally via Websocket.
  • the TCP order may be disordered in high concurrency situations.
  • The embodiments of the present disclosure add an ACK mechanism similar to the TCP handshake: every time an instruction is sent, a unique PING datum is generated from a microsecond timestamp (this assumes the number of online users will not exceed 1,000,000; if overflow occurs, other methods such as the snowflake algorithm can be used to generate the PING data). The receiver is responsible for returning a PONG with the same value as a response, and the terminal confirms whether the data has arrived according to the PONG reception status.
  • The cloud accepts the data and judges whether the current pointer has reached the data packet; if the packet has not yet been reached, the write overwrites it. If the worker thread is reading it, or a subsequent data packet has already been read, the data is discarded.
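  • The PING/PONG acknowledgement keyed by a microsecond timestamp can be sketched as follows (the tracker class is hypothetical; the snowflake fallback is omitted):

```python
# Sketch of the PING/PONG acknowledgement: each outgoing instruction
# carries a unique microsecond-timestamp PING id; the receiver echoes the
# same value as PONG, and the sender marks the instruction delivered.
import time

class AckTracker:
    def __init__(self):
        self.pending = {}         # ping id -> instruction awaiting a PONG

    def send(self, instruction: bytes) -> int:
        ping = time.time_ns() // 1_000   # microsecond timestamp as unique id
        self.pending[ping] = instruction
        return ping

    def on_pong(self, ping: int) -> bool:
        """True if the PONG matches an outstanding PING (data confirmed)."""
        return self.pending.pop(ping, None) is not None

tracker = AckTracker()
ping_id = tracker.send(b"play")
confirmed = tracker.on_pong(ping_id)
stale = tracker.on_pong(ping_id)   # a second PONG for the same id is ignored
```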
  • In step S90, for example, when the client connects, the controller obtains the connection handle.
  • When the client sends an initialization request, the controller initializes the memory, opens the input buffer area and output buffer area, creates a working thread, and generates verification information (a token). It then creates, within the working thread, a correspondence between the bidirectional data channel, the output buffer area, the input buffer area, the verification information, and the working thread; the correspondence is stored, for example, in the form of a key-value pair.
  • In some embodiments, the information processing method may further include: monitoring the read lock and write lock of the working thread through the controller to obtain read-lock events and write-lock events; and providing the read-lock events and write-lock events to the first client through the controller.
  • the controller listens to each thread lock event at the same time, delivers notifications to each terminal according to the event, and temporarily suspends the read/write package based on the lock situation so that the entire system can operate healthily.
  • The controller also periodically sends messages to the worker thread, and clears the worker thread in response to not receiving a response to the message within a preset time period. For example, a thread from which no receipt is received within a preset time period (for example, within 1 minute) is marked as a zombie thread. For the health of the entire system, the controller safely clears zombie threads and notifies the relevant socket handle to release resources.
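  • The heartbeat-based zombie detection (no receipt within the preset period → marked zombie and cleared) can be sketched as follows; time is injected as a parameter purely to keep the sketch testable:

```python
# Heartbeat sketch: the controller records the last receipt time per worker
# thread; a worker silent for longer than the preset period (1 minute in
# the example above) is marked a zombie. Names are illustrative.
ZOMBIE_TIMEOUT_S = 60

class Heartbeat:
    def __init__(self):
        self.last_receipt = {}    # thread id -> timestamp of last response

    def on_receipt(self, thread_id: int, now: float) -> None:
        self.last_receipt[thread_id] = now

    def find_zombies(self, now: float):
        return [tid for tid, t in self.last_receipt.items()
                if now - t > ZOMBIE_TIMEOUT_S]

hb = Heartbeat()
hb.on_receipt(1, now=0.0)
hb.on_receipt(2, now=50.0)
zombies = hb.find_zombies(now=65.0)   # thread 1 has been silent for 65 s
```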
  • Figure 6 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure.
  • the information processing method may also include steps S601 and S602 . Steps S601 and S602 may be executed after step S50, for example.
  • Step S601 In response to receiving the second initialization application provided by the second client, obtain the information to be verified from the second initialization application.
  • Step S602 In response to the information to be verified being consistent with the verification information, obtain the output buffer area corresponding to the verification information according to the corresponding relationship, and provide the second video stream data to the second client from the output buffer area.
  • a client 104 may also be included.
  • Client 104 is an example of a second client.
  • After the client 101 sends the first initialization request to the server, the server returns a token in the first message body.
  • the client 101 can share the token with other users (for example, the client 104), so that the client 104 generates a second initialization application based on the token and sends the second initialization application to the server.
  • The token can be shared with other users, so that other users can access the video stream sent by the server through the token, which is equivalent to multicasting the video. If too many clients access with the token, connections up to the maximum number of multicast connections are retained, and service is refused to the remaining token holders.
  • the task initiator has full authority to handle the task.
  • the server In response to receiving the second initialization application provided by the client 104, the server obtains the information to be verified from the second initialization application.
  • In response to the information to be verified being consistent with the verification information, the server obtains the output buffer area corresponding to the verification information according to the correspondence, and provides the second video stream data to the second client from the output buffer area.
  • Table 3 below shows a data set of a server provided by at least one embodiment of the present disclosure.
  • the data provided by the server to the client can be generated according to the format of Table 3. For example, in the example of Table 3, if the first byte of the data returned by the server to the client is 1, the data is PONG data; if the 1-byte status bit in the PONG data is 01, it indicates that the first video stream data has entered the input buffer. For another example, if the first byte of the data returned by the server to the client is 0, it indicates that the data is a data stream. If the secondary byte of the data stream is 1, the data stream is an event stream. If the 1-byte three-level classification of the event stream is 02, it indicates that the cloud player is paused. If the 1-byte three-level classification is E0, it indicates that the write lock is triggered.
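  • The downlink classification described for Table 3 (first byte 1 = PONG; first byte 0 = data stream, whose secondary byte 1 marks an event stream with a one-byte third-level code such as 02 = paused, E0 = write lock) can be sketched as a classifier; the function name and the fallback label are assumptions:

```python
# Classify a downlink packet per the Table 3 scheme described above:
# first byte 1 -> PONG data; first byte 0 -> data stream, whose secondary
# byte 1 marks an event stream carrying a one-byte third-level code.
EVENT_CODES = {0x02: "player paused", 0xE0: "write lock triggered"}

def classify_downlink(data: bytes) -> str:
    if data[0] == 1:
        return "pong"
    if data[0] == 0 and data[1] == 1:
        return "event: " + EVENT_CODES.get(data[2], "unknown")
    return "media stream"
```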
  • Figure 7 shows a flow chart of another information processing method provided by at least one embodiment of the present disclosure.
  • This information processing method is executed, for example, by the client 101 or the client 104 in FIG. 1 .
  • the information processing method may include steps: S710 to S730 .
  • Step S710 Send an initialization application to the server, where the initialization application includes playback parameters that support the playback of video stream data.
  • Step S720 After the server responds to the initialization request, send a task instruction to the server.
  • Step S730 Receive the second video stream data provided by the server, and play the second video stream data.
  • the second video stream data is obtained by the server converting the first video stream data obtained according to the task instruction.
  • The server encodes and decodes the audio and video to be played by the first client, thereby utilizing the computing power of the cloud's (that is, the server's) powerful software/hardware to meet the audio and video playback needs of most terminals, achieving a cloud player that solves the difficulty of audio and video playback in complex codec environments.
  • the client 101 sends an initialization request to the server.
  • the initialization request may be, for example, the first initialization request described above.
  • the initialization request is, for example, generated according to the instruction set in Table 1 above.
  • the client 104 may also send an initialization application to the server after receiving the token shared by the client 101.
  • step S720 for example, after receiving a response provided by the server, a task instruction is sent to the server.
  • the task instruction may be, for example, requesting the server to provide second video stream data.
  • Step S720 may be, for example, the client 101 or the client 104 pulling the stream.
  • In response to receiving the task instruction, the server performs, for example, the above steps S30 to S50 to convert the first video stream data obtained according to the task instruction into the second video stream data.
  • step S730 for example, the client 101 receives the second video stream data and plays the second video stream data.
  • Step S730 may include: receiving the second video stream data provided by the server; parsing the second video stream data to obtain the basic playback information in it; and playing the second video stream data according to the basic playback information.
  • the second video stream data includes obfuscated packets.
  • The first 4 bytes of the obfuscation packet are the string identification bits "wc", and the next 8 bytes are the token.
  • The token here is the 8-byte unsigned int token obtained when the socket was established; the following 8 bytes are a random-length value, followed by that number of randomly generated obfuscation bytes. What follows is the real moof package's specific sub-box, mfhd, etc., thereby improving security.
  • The client needs to construct moof, track fragment (traf), and other container boxes by itself. For example, let the bytes after mfhd be Buf, and let the length of Buf be the byte count of traf, recorded as BufLen. Convert the UTF-8 string "mfhd" into a byte array, concatenate it with BufLen (whose bytes are generated in little-endian mode), and then append Buf. This large byte array is recorded as TrafBuf, its length as TrafBufLen, and the traf construction is complete.
  • the Mfhd byte is recorded as MfhdBuf, and the length is recorded as MfhdBufLen.
  • MoofBufLen = MfhdBufLen + TrafBufLen; convert the UTF-8 string "moof" into a byte array, splice it with the little-endian bytes of MoofBufLen, then with MfhdBuf, and then with TrafBuf, to obtain the complete moof package.
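  • A literal sketch of this byte concatenation follows. It reproduces the text's scheme exactly (type string, then a little-endian length, then the payload); real ISO-BMFF boxes use a different layout, so treat this only as an illustration of the described steps, with a 4-byte length field as an assumption:

```python
# Literal sketch of the client-side moof reconstruction described above.
# Box layout follows the text (type string + little-endian length +
# payload); this is not standard ISO-BMFF box framing.
import struct

def build_box(type_str: str, payload: bytes) -> bytes:
    """Type bytes + 4-byte little-endian payload length + payload."""
    return type_str.encode("utf-8") + struct.pack("<I", len(payload)) + payload

def build_moof(mfhd_buf: bytes, traf_payload: bytes) -> bytes:
    # TrafBuf: per the text, the "mfhd" string, little-endian BufLen, then Buf.
    traf_buf = build_box("mfhd", traf_payload)
    # MoofBufLen = MfhdBufLen + TrafBufLen, then "moof" + length + both parts.
    moof_len = len(mfhd_buf) + len(traf_buf)
    return b"moof" + struct.pack("<I", moof_len) + mfhd_buf + traf_buf

moof = build_moof(b"MFHD", b"PAYLOAD")
```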
  • FIG. 8 shows a schematic diagram of a system architecture 800 for applying an information processing method provided by at least one embodiment of the present disclosure.
  • the system architecture 800 may include a video playback tag 801, a multimedia source cache 802, a browser player toolkit 803, a bidirectional channel connection 804, a controller 805, an input buffer 806, multiple worker threads 807-837, and an output buffer 808.
  • the video playback tag 801, the multimedia source cache 802, and the browser player toolkit 803 are on the client side, while the controller 805, the input buffer 806, the multiple worker threads 807-837, and the output buffer 808 are on the cloud player side.
  • multiple working threads 807 to 837 are shown.
  • the multiple working threads 807 to 837 are respectively created for different clients.
  • the following uses the working thread 807 as an example to illustrate the implementation.
  • the client and the cloud player communicate via a bidirectional channel connection 804, which may be, for example, a websocket.
  • the browser player toolkit 803 is, for example, a Browser Player SDK configured to collect the first video stream data as payload and send it in the specified format (for example, transport stream (TS) or program stream (PS)) to the cloud player on the server through the websocket.
  • the controller 805 of the cloud player responds to receiving the first video stream data and sends the first video stream data to the input buffer 806 created in advance.
  • the controller 805 is responsible for linking the upstream and downstream.
  • when a socket client connects, the controller 805 obtains the connection handle.
  • when the socket client sends an initialization request, the controller 805 initializes memory, creates a worker thread, and generates a token, and forms a key-value pair from these three in the main thread. At the same time, it monitors each thread's lock events, delivers notifications to each terminal according to the events, and temporarily suspends packet reading/writing according to the lock state, so that the entire system operates healthily.
  • the controller 805 also periodically sends messages to the worker threads. For example, a thread that gives no receipt within a preset time period is marked as a zombie thread. For the health of the entire system, the controller safely clears zombie threads, notifies the relevant socket handles, and releases resources.
  • the controller 805 thus takes on responsibilities similar to those of a housekeeper or supervisor.
  • the working thread 807 obtains the first video stream data from the input buffer area 806, and encodes and decodes the first video stream data to obtain the second video stream data.
  • the working thread 807 sends the second video stream data to the output buffer 808 in sequence.
  • worker thread 807 includes the ffmpeg toolkit.
  • after the ffmpeg initialization data is complete, the worker thread waits for an event indication.
  • that is, the working thread 807 waits for the actual task instruction.
  • the controller writes the task instruction (for example, 00 01 3E 72 00 74 00 73 00 70 00 …, where 00 represents a data instruction, 01 represents the URL method, 3E indicates a URL length of 62, and the following bytes carry the URL, e.g., rtsp://…) into the read memory, and sends an event to notify the worker thread 807.
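The instruction layout in the example above (first byte 0x00 for a data instruction, second byte 0x01 for the URL method, then a length field and the URL bytes) might be encoded and decoded as in the following sketch. The single-byte length field and plain ASCII URL encoding are assumptions for illustration; the original example appears to use a wider per-character encoding.

```python
def encode_url_instruction(url: str) -> bytes:
    """Build a data instruction using the URL method:
    byte 0: 0x00 (data instruction), byte 1: 0x01 (URL method),
    byte 2: URL length, then the URL bytes (ASCII assumed here)."""
    data = url.encode("ascii")
    return bytes([0x00, 0x01, len(data)]) + data


def decode_instruction(raw: bytes):
    """Inverse of the above; returns (method name, decoded payload)."""
    if raw[0] != 0x00:
        raise ValueError("not a data instruction")
    if raw[1] == 0x01:                      # URL method
        length = raw[2]
        return "url", raw[3:3 + length].decode("ascii")
    return "other", raw[2:]
```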
  • the output buffer 808 sends the data packets of the second video stream data to the controller 805, and the controller 805 provides the second video stream data to the client through the websocket.
  • the client's browser player toolkit 803 parses the custom data packet, restores it to fmp4, and provides the fmp4 to the multimedia source buffer (Media Source Buffer) 802, which in turn provides the fmp4 to the video playback tag 801.
  • the video playback tag 801 is, for example, the HTML5 <video> tag.
  • SCW mode is a custom mode for websocket application layer communication, used for two-way notification of streaming messages, thread scheduling management, etc.
  • FIG. 9 shows a schematic diagram of a server architecture 900 that applies an information processing method provided by at least one embodiment of the present disclosure.
  • the server architecture includes a controller 901, an input buffer 902, a worker thread 903, and an output buffer 904.
  • the controller 901 checks the protocol, token, etc. on the fragmented byte stream, and provides the fragmented byte stream that passes the inspection to the input buffer area 902.
  • the internal cache of the working thread 903 continuously reads the byte stream from the input buffer area 902 and stores it in the processing queue of the internal cache.
  • the processing queue of the internal cache includes byte stream 4, byte stream 5 and byte stream 6.
  • the internal cache may be a ring memory provided with two pointers: an initial pointer indicating the starting point of the ring memory, and a current pointer indicating the location of the byte stream that currently needs to be read. If the cache space between the initial pointer and the current pointer can accommodate a byte stream read from the input buffer, the byte stream is stored in that cache space.
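A minimal sketch of the ring memory just described. Mapping the two pointers to a read position and a write position, and rejecting a chunk that does not fit in the free space, is an interpretation of the text rather than a confirmed implementation.

```python
class RingBuffer:
    """Ring memory with an initial (read) pointer and a current (write) pointer;
    a byte stream is stored only if the free space can accommodate it."""

    def __init__(self, capacity: int):
        self.mem = bytearray(capacity)
        self.capacity = capacity
        self.initial = 0   # start of unread data (read position)
        self.current = 0   # next write position
        self.size = 0      # bytes currently stored

    def free_space(self) -> int:
        return self.capacity - self.size

    def write(self, chunk: bytes) -> bool:
        if len(chunk) > self.free_space():
            return False                          # does not fit: caller must wait
        for b in chunk:
            self.mem[self.current] = b
            self.current = (self.current + 1) % self.capacity
        self.size += len(chunk)
        return True

    def read(self, n: int) -> bytes:
        n = min(n, self.size)
        out = bytearray()
        for _ in range(n):
            out.append(self.mem[self.initial])
            self.initial = (self.initial + 1) % self.capacity
        self.size -= n
        return bytes(out)
```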
  • the partial cache of the worker thread 903 reads byte streams from the processing queue of the internal cache to encode and decode them. For example, based on the playback parameters, the read lock, and the status, the avio callback function of AVFormatCtx is used to read data from the partial cache to obtain the object AVInputStream and the corresponding parameters. After that, the ffmpeg toolkit continuously reads the AVInputStream stream data and hands it over for decoding.
  • the object AVPacket Reader in the ffmpeg toolkit is used to read the input stream; the object SwsScale/AVFilter is then used to filter the input stream or perform image scaling; and the ffmpeg toolkit uses Halt/Resume to handle transactions such as pause and resume, as well as control operations such as pending encoding and decoding.
  • the output stream encoded and decoded by the ffmpeg toolkit forms the output stream AVOutputStream through the object AVFormat Ctx.
  • the video stream in the fmp4 format is written to the output buffer area 904.
  • the ffmpeg toolkit cyclically reuses objects such as AVPacket Reader, SwsScale/AVFilter, Halt/Resume, and pending to process each byte stream.
  • the head of the output buffer 904 is the above-mentioned codec packet containing the encoding information; see the description above for details.
  • the bytes following the head of the output buffer 904 are fmp4 data packets, which may include, for example, ftyp data packets and moov data packets; ftyp is used to record the file type, and moov is used to record basic playback information such as the frame rate.
  • the output buffer area 904 writes the video stream in fmp4 format to the controller, and the controller provides fmp4 to the client.
  • in the embodiment of the present disclosure, since the channel is a bidirectional data channel, security is divided into three parts: handshake security, uplink security, and downlink security.
  • the handshake part is symmetric or asymmetric encryption of WSS.
  • downlink security refers to the security of data sent from the cloud to the client, specifically the non-reusability of the data; that is, no third party can decode the video stream without requesting it from the cloud.
  • the embodiment of the present disclosure uses a mixed mode: if a packet contains a key frame, the moof packet is encapsulated into an obfuscation packet; otherwise, the packet is sent normally.
  • the obfuscation packet contains the necessary data information, but in a different format, and requires the cooperation of the terminal to generate key frames; the format can be found in the attachment. Owing to the special nature of key frames, the other reference frames are distorted if key frames are missing. Since this part relies on the client, it exists as a pluggable plug-in, which is configurable.
  • uplink security refers to the security of data uploaded by the client to the cloud, specifically the inoperability of the data; that is, no third party can access its data without requesting a token from the terminal.
  • controller 901 is also used to perform thread management on the working thread 903 .
  • FIG. 10 shows a schematic block diagram of an information processing device 1000 provided by at least one embodiment of the present disclosure.
  • the information processing device 1000 includes a first receiving unit 1010 , a second receiving unit 1020 , an instruction acquisition unit 1030 , a conversion unit 1040 and a providing unit 1050 .
  • the first receiving unit 1010 is configured to receive a first initial application from a first client and respond to the first initial application, where the first initial application includes playback parameters of video stream data supported by the first client.
  • the first receiving unit 1010 may, for example, perform step S10 described in FIG. 2 .
  • the second receiving unit 1020 is configured to receive the task instruction of the first client after responding to the first initial application.
  • the second receiving unit 1020 may, for example, execute step S20 described in FIG. 2 .
  • the instruction acquisition unit 1030 is configured to acquire the first video stream data according to the task instruction.
  • the instruction acquisition unit 1030 may, for example, perform step S30 described in FIG. 2 .
  • the conversion unit 1040 is configured to convert the first video stream data into second video stream data, so that the second video stream data has the playback parameters.
  • the conversion unit 1040 may, for example, perform step S40 described in FIG. 2 .
  • the providing unit 1050 is configured to provide the second video stream data to the first client so that the second video stream data is played on the first client.
  • the providing unit 1050 may, for example, execute step S50 described in FIG. 2 .
  • FIG. 11 shows a schematic block diagram of another information processing device 1100 provided by at least one embodiment of the present disclosure.
  • the information processing device 1100 includes an application sending unit 1110 , an instruction sending unit 1120 and a playback unit 1130 .
  • the application sending unit 1110 is configured to send an initialization application to the server, where the initialization application includes playback parameters that support playback of video stream data.
  • the application sending unit 1110 may, for example, perform step S710 described in FIG. 7 .
  • the instruction sending unit 1120 is configured to send a task instruction to the server after the server responds to the initialization application.
  • the instruction sending unit 1120 may, for example, perform step S720 described in FIG. 7 .
  • the playback unit 1130 is configured to receive the second video stream data provided by the server and play the second video stream data.
  • the second video stream data is obtained by the server converting the first video stream data acquired according to the task instruction.
  • the playback unit 1130 may, for example, perform step S730 described in FIG. 7 .
  • the first receiving unit 1010, the second receiving unit 1020, the instruction acquisition unit 1030, the conversion unit 1040 and the providing unit 1050, the application sending unit 1110, the instruction sending unit 1120 and the playing unit 1130 can be hardware, software, firmware and any feasible combination thereof.
  • the first receiving unit 1010, the second receiving unit 1020, the instruction acquisition unit 1030, the conversion unit 1040 and the providing unit 1050, the application sending unit 1110, the instruction sending unit 1120 and the playing unit 1130 can be dedicated or general circuits, chips or devices, etc., or can be a combination of a processor and a memory.
  • the embodiments of the present disclosure do not limit the specific implementation forms of the above-mentioned units.
  • each unit of the information processing device 1000 or the information processing device 1100 corresponds to a step of the aforementioned information processing method; for the specific functions of these devices, reference may be made to the relevant description of the information processing method, which is not repeated here.
  • the components and structures of the information processing device 1000 shown in FIG. 10 and the information processing device 1100 shown in FIG. 11 are only exemplary and not restrictive; as needed, the information processing device 1000 or the information processing device 1100 may also include other components and structures.
  • At least one embodiment of the present disclosure also provides an electronic device including a processor and a memory. One or more computer program modules are stored in the memory and configured to be executed by the processor, and the one or more computer program modules include instructions for implementing the above-mentioned information processing method.
  • the electronic device allows the server to encode and decode the audio and video to be played by the first client, thereby utilizing the computing power provided by the powerful software/hardware of the cloud (i.e., the server) to meet the audio and video playback needs of most terminals; a cloud player is thus implemented, which solves the problem of difficult audio and video playback in complex codec environments.
  • Figure 12A is a schematic block diagram of an electronic device provided by some embodiments of the present disclosure.
  • the electronic device 1200 includes a processor 1210 and a memory 1220.
  • Memory 1220 is used to store non-transitory computer-readable instructions (eg, one or more computer program modules).
  • the processor 1210 is configured to execute non-transitory computer readable instructions. When the non-transitory computer readable instructions are executed by the processor 1210, they may perform one or more steps in the information processing method described above.
  • Memory 1220 and processor 1210 may be interconnected by a bus system and/or other forms of connection mechanisms (not shown).
  • the processor 1210 may be a central processing unit (CPU), a graphics processing unit (GPU), or other forms of processing units with data processing capabilities and/or program execution capabilities.
  • the central processing unit (CPU) may be of X86 or ARM architecture.
  • the processor 1210 may be a general-purpose processor or a special-purpose processor that may control other components in the electronic device 1200 to perform desired functions.
  • memory 1220 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • Volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache), etc.
  • Non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disk read-only memory (CD-ROM), USB memory, flash memory, and the like.
  • One or more computer program modules may be stored on the computer-readable storage medium, and the processor 1210 may run the one or more computer program modules to implement various functions of the electronic device 1200 .
  • Various application programs and various data, as well as various data used and/or generated by the application programs, etc. can also be stored in the computer-readable storage medium.
  • FIG. 13 is a schematic block diagram of another electronic device provided by some embodiments of the present disclosure.
  • the electronic device 1300 is, for example, suitable for implementing the information processing method provided by the embodiment of the present disclosure.
  • the electronic device 1300 may be a terminal device or the like. It should be noted that the electronic device 1300 shown in FIG. 13 is only an example, which does not bring any limitations to the functions and scope of use of the embodiments of the present disclosure.
  • the electronic device 1300 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 1310, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1320 or a program loaded from a storage device 1380 into a random access memory (RAM) 1330.
  • the RAM 1330 also stores various programs and data required for the operation of the electronic device 1300.
  • the processing device 1310, ROM 1320 and RAM 1330 are connected to each other through a bus 1340.
  • An input/output (I/O) interface 1350 is also connected to bus 1340.
  • the following devices may be connected to the I/O interface 1350: input devices 1360 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output devices 1370 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; storage devices 1380 including, for example, a magnetic tape and a hard disk; and a communication device 1390.
  • the communication device 1390 may allow the electronic device 1300 to communicate wirelessly or wiredly with other electronic devices to exchange data.
  • FIG. 13 illustrates electronic device 1300 having various means, it should be understood that implementation or provision of all illustrated means is not required and electronic device 1300 may alternatively implement or be provided with more or fewer means.
  • the above-mentioned information processing method can be implemented as a computer software program.
  • an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, and the computer program includes a program code for executing the above-mentioned information processing method.
  • the computer program can be downloaded and installed from the network through the communication device 1390, or installed from the storage device 1380, or installed from the ROM 1320.
  • when the computer program is executed by the processing device 1310, the functions defined in the information processing method provided by the embodiments of the present disclosure are implemented.
  • At least one embodiment of the present disclosure also provides a computer-readable storage medium for storing non-transitory computer-readable instructions that, when executed by a computer, can implement the above information processing method.
  • for example, the server encodes and decodes the audio and video to be played by the first client, thereby utilizing the computing power provided by the powerful software/hardware of the cloud (i.e., the server) to satisfy the audio and video playback needs of most terminals; a cloud player is thus implemented, which solves the problem of difficult audio and video playback in complex codec environments.
  • Figure 14 is a schematic diagram of a storage medium provided by some embodiments of the present disclosure. As shown in FIG. 14, storage medium 1400 is used to store non-transitory computer-readable instructions 1410. For example, the non-transitory computer readable instructions 1410, when executed by a computer, may perform one or more steps in the information processing method described above.
  • the storage medium 1400 may be applied to the electronic device 1200.
  • the storage medium 1400 may be the memory 1220 in the electronic device 1200 shown in FIG. 12.
  • for a description of the storage medium 1400, reference may be made to the corresponding description of the memory 1220 in the electronic device 1200 shown in FIG. 12, which is not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

An information processing method, an information processing apparatus, an electronic device, and a computer-readable storage medium. The information processing method includes: receiving a first initial application from a first client and responding to the first initial application, the first initial application including playback parameters of video stream data that the first client supports playing (S10); after responding to the first initial application, receiving a task instruction from the first client (S20); acquiring first video stream data according to the task instruction (S30); converting the first video stream data into second video stream data so that the second video stream data has the playback parameters (S40); and providing the second video stream data to the first client so that the second video stream data is played on the first client (S50). The method can solve the problem of difficult audio and video playback in complex codec environments.

Description

Information processing method and apparatus, electronic device, and computer-readable storage medium — Technical Field
Embodiments of the present disclosure relate to an information processing method, an information processing apparatus, an electronic device, and a computer-readable storage medium.
Background Art
With the rapid development of science, technology, and the economy, audio and video have gradually become part of our lives. Streaming media technology has become the mainstream technology for audio and video transmission. Streaming media technology makes it possible to watch and listen while downloading, instead of waiting until the entire audio/video file has been downloaded to one's computer before it can be watched.
Summary of the Invention
At least one embodiment of the present disclosure provides an information processing method, including: receiving a first initial application from a first client and responding to the first initial application, wherein the first initial application includes playback parameters of video stream data that the first client supports playing; after responding to the first initial application, receiving a task instruction from the first client; acquiring first video stream data according to the task instruction; converting the first video stream data into second video stream data so that the second video stream data has the playback parameters; and providing the second video stream data to the first client so that the second video stream data is played on the first client.
For example, in an information processing method provided by an embodiment of the present disclosure, receiving the first initial application from the first client and responding to the first initial application includes: receiving the first initial application from the first client; and, in response to the first initial application, creating a worker thread and opening an input buffer and an output buffer for the worker thread, wherein the input buffer is configured to store the first video stream data, the worker thread is configured to obtain the first video stream data from the input buffer and process the first video stream data to obtain the second video stream data, and the output buffer is configured to receive the second video stream data provided by the worker thread and provide the second video stream data to the first client.
For example, in an information processing method provided by an embodiment of the present disclosure, acquiring the first video stream data according to the task instruction includes: determining, according to the task instruction, the manner of acquiring the first video stream data, and acquiring the first video stream data in that manner.
For example, in an information processing method provided by an embodiment of the present disclosure, determining the acquisition manner and acquiring the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the uniform resource locator (URL) manner, obtaining a URL from the task instruction; acquiring the first video stream data according to the URL; and storing the first video stream data in the input buffer.
For example, in an information processing method provided by an embodiment of the present disclosure, determining the acquisition manner and acquiring the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the buffer manner, extracting the first video stream data from the task instruction; and storing the first video stream data in the input buffer.
For example, in an information processing method provided by an embodiment of the present disclosure, determining the acquisition manner and acquiring the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the fragmented byte stream manner, sequentially receiving a plurality of task sub-instructions provided by the first client, wherein the plurality of task sub-instructions respectively include different portions of the first video stream data; and sequentially extracting the portions of the first video stream data from the plurality of task sub-instructions and storing the portions in the input buffer in order.
For example, in an information processing method provided by an embodiment of the present disclosure, converting the first video stream data into the second video stream data so that the second video stream data has the playback parameters includes: the worker thread reading the first video stream data from the input buffer and storing the first video stream data in a processing queue of the worker thread; and, according to the playback parameters, the worker thread decoding and encoding the first video stream data in the processing queue to obtain the second video stream data.
For example, in an information processing method provided by an embodiment of the present disclosure, providing the second video stream data to the first client so that the second video stream data is played on the first client includes: generating encoding information according to the encoding operation performed by the worker thread on the first video stream data; storing the encoding information at the head of the output buffer so that the first data packet output by the output buffer contains the encoding information; writing the second video stream data into the output buffer; and sequentially providing a plurality of data packets from the output buffer to the first client, wherein the plurality of data packets include the encoding information and the second video stream data.
For example, in an information processing method provided by an embodiment of the present disclosure, writing the second video stream data into the output buffer includes: encapsulating the basic playback information of key frames in the second video stream data into an obfuscation packet so that the first client parses the obfuscation packet to obtain the basic playback information, wherein the obfuscation packet includes a plurality of randomly generated bytes.
For example, in an information processing method provided by an embodiment of the present disclosure, the worker thread includes a read lock that is triggered when the input buffer is full; in response to the read lock being triggered, the worker thread no longer reads the first video stream data from the input buffer.
For example, in an information processing method provided by an embodiment of the present disclosure, the worker thread further includes a write lock that is triggered when the output buffer is full; in response to the write lock being triggered, the worker thread no longer writes the second video stream data into the output buffer and no longer reads the first video stream data from the input buffer.
For example, an information processing method provided by an embodiment of the present disclosure further includes: obtaining the read speed at which the input buffer reads in the first video stream data; obtaining the transcoding speed at which the worker thread converts the first video stream data into the second video stream data; obtaining the write speed at which the output memory outputs the second video stream data; determining whether the transcoding speed is greater than the write speed and the write speed is greater than the read speed; and, in response to the read speed, the transcoding speed, and the write speed not satisfying the condition that the transcoding speed is greater than the write speed and the write speed is greater than the read speed, adjusting the read speed, the transcoding speed, and the write speed.
For example, in an information processing method provided by an embodiment of the present disclosure, the transcoding bit rate is divided into a plurality of encoding layers with successively decreasing transcoding bit rates, and adjusting the read speed, the transcoding speed, and the write speed in response to the condition not being satisfied includes: in response to the transcoding speed being less than the write speed, from the next key frame of the first video stream data onward, adjusting the transcoding bit rate to the transcoding bit rate corresponding to the encoding layer below the current encoding layer.
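The speed invariant and layer-downgrade rule just described can be sketched as a small decision function. The bit-rate values of the encoding layers below are illustrative assumptions, not values from the disclosure, and only the transcoding-speed bottleneck case is handled.

```python
ENCODING_LAYERS = [8000, 4000, 2000, 1000]  # assumed bit rates (kbps), high to low


def next_layer(current: int) -> int:
    """Drop to the next lower encoding layer, staying at the lowest if already there."""
    i = ENCODING_LAYERS.index(current)
    return ENCODING_LAYERS[min(i + 1, len(ENCODING_LAYERS) - 1)]


def check_and_adjust(read_speed, transcode_speed, write_speed, current_layer):
    """Return the encoding layer to use from the next key frame onward.
    The healthy invariant is transcode_speed > write_speed > read_speed."""
    if transcode_speed > write_speed > read_speed:
        return current_layer                 # pipeline healthy: keep the layer
    if transcode_speed < write_speed:
        return next_layer(current_layer)     # transcoding is the bottleneck: downgrade
    return current_layer                     # other imbalances are handled elsewhere
```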
For example, an information processing method provided by an embodiment of the present disclosure further includes: obtaining a player instruction from the first client; and, according to the player instruction, controlling the playback state of the second video stream data on the first client.
For example, in an information processing method provided by an embodiment of the present disclosure, the player instruction includes at least one of the following: a pause-playback instruction, a start-playback instruction, a speed-multiplier playback instruction, a reset-to-initial-state instruction, and a jump instruction.
For example, an information processing method provided by an embodiment of the present disclosure further includes: before receiving the first initial application from the first client, establishing a bidirectional data channel with the first client; and, after responding to the first initial application, establishing a correspondence among the bidirectional data channel, the output buffer, the input buffer, the verification information, and the worker thread, so as to interact with the first client according to the correspondence.
For example, an information processing method provided by an embodiment of the present disclosure further includes: in response to receiving a second initialization application provided by a second client, obtaining to-be-verified information from the second initialization application; and, in response to the to-be-verified information being consistent with the verification information, obtaining the output buffer corresponding to the verification information according to the correspondence and providing the second video stream data from that output buffer to the second client.
For example, an information processing method provided by an embodiment of the present disclosure further includes: monitoring the read lock and the write lock of the worker thread through a controller to obtain read-lock events and write-lock events; and providing the read-lock events and the write-lock events to the first client through the controller.
For example, in an information processing method provided by an embodiment of the present disclosure, the controller periodically provides messages to the worker thread, and the method further includes: in response to the controller not receiving, within a preset time period, a response from the worker thread to the message, clearing the worker thread.
For example, in an information processing method provided by an embodiment of the present disclosure, the worker thread decoding and encoding the first video stream data in the processing queue according to the playback parameters to obtain the second video stream data includes: the worker thread loading the first video stream data in the processing queue into a codec; and the codec decoding and encoding the first video stream data to obtain the second video stream data.
For example, an information processing method provided by an embodiment of the present disclosure further includes: in response to an exception occurring in the first video stream data before it enters the codec, the worker thread sending a rollback event to the controller; and the controller, in response to the rollback event, returning the first video stream data to the input buffer.
For example, an information processing method provided by an embodiment of the present disclosure further includes: in response to an internal processing exception of the codec, the worker thread requesting the controller to mark the worker thread as a zombie thread.
For example, an information processing method provided by an embodiment of the present disclosure further includes: in response to an exception occurring after the codec has processed the first video stream data, the worker thread sending a packet-loss event to the controller, and the controller sending a packet-loss reminder to the first client.
At least one embodiment of the present disclosure provides an information processing method, including: sending an initialization application to a server, wherein the initialization application includes playback parameters of video stream data supported for playback; after the server responds to the initialization application, sending a task instruction to the server; and receiving second video stream data provided by the server and playing the second video stream data, wherein the second video stream data is obtained by the server converting first video stream data acquired according to the task instruction.
For example, in an information processing method provided by an embodiment of the present disclosure, receiving the second video stream data provided by the server and playing the second video stream data includes: receiving the second video stream data provided by the server; parsing the second video stream data to obtain basic playback information in the second video stream data; and playing the second video stream data according to the basic playback information.
At least one embodiment of the present disclosure provides an information processing apparatus, including: a first receiving unit configured to receive a first initial application from a first client and respond to the first initial application, wherein the first initial application includes playback parameters of video stream data that the first client supports playing; a second receiving unit configured to receive a task instruction from the first client after responding to the first initial application; an instruction acquisition unit configured to acquire first video stream data according to the task instruction; a conversion unit configured to convert the first video stream data into second video stream data so that the second video stream data has the playback parameters; and a providing unit configured to provide the second video stream data to the first client so that the second video stream data is played on the first client.
At least one embodiment of the present disclosure provides an information processing apparatus, including: an application sending unit configured to send an initialization application to a server, wherein the initialization application includes playback parameters of video stream data supported for playback; an instruction sending unit configured to send a task instruction to the server after the server responds to the initialization application; and a playback unit configured to receive second video stream data provided by the server and play the second video stream data, wherein the second video stream data is obtained by the server converting first video stream data acquired according to the task instruction.
At least one embodiment of the present disclosure provides an electronic device, including: a processor; and a memory including one or more computer program instructions, wherein the one or more computer program instructions are stored in the memory and, when executed by the processor, implement the instructions of the information processing method provided by any embodiment of the present disclosure.
At least one embodiment of the present disclosure provides a computer-readable storage medium that non-transitorily stores computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the information processing method provided by any embodiment of the present disclosure.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings of the embodiments are briefly introduced below. Obviously, the drawings described below relate only to some embodiments of the present disclosure and do not limit the present disclosure.
FIG. 1 shows a system architecture 100 applied to an information processing method provided by at least one embodiment of the present disclosure;
FIG. 2 shows a flowchart of an information processing method provided by at least one embodiment of the present disclosure;
FIG. 3A shows a flowchart of a method for step S50 in FIG. 2 provided by at least one embodiment of the present disclosure;
FIG. 3B shows an example diagram of a plurality of encoding layers provided by at least one embodiment of the present disclosure;
FIG. 4 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure;
FIG. 5 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure;
FIG. 6 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure;
FIG. 7 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure;
FIG. 8 shows a schematic diagram of a system architecture 800 applying an information processing method provided by at least one embodiment of the present disclosure;
FIG. 9 shows a schematic diagram of a server architecture 900 applying an information processing method provided by at least one embodiment of the present disclosure;
FIG. 10 shows a schematic block diagram of an information processing device 1000 provided by at least one embodiment of the present disclosure;
FIG. 11 shows a schematic block diagram of another information processing device 1100 provided by at least one embodiment of the present disclosure;
FIG. 12 shows a schematic block diagram of an electronic device provided by at least one embodiment of the present disclosure;
FIG. 13 shows a schematic block diagram of another electronic device provided by at least one embodiment of the present disclosure; and
FIG. 14 shows a schematic diagram of a computer-readable storage medium provided by at least one embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present disclosure. Based on the described embodiments of the present disclosure, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present disclosure.
Unless otherwise defined, the technical or scientific terms used in the present disclosure shall have the ordinary meanings understood by persons with ordinary skill in the field to which the present disclosure belongs. The words "first", "second", and similar words used in the present disclosure do not denote any order, quantity, or importance, but are merely used to distinguish different components. Likewise, words such as "a/an", "one", or "the" do not denote a limitation of quantity, but rather the presence of at least one. Words such as "comprise" or "include" mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connected" or "coupled" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Up", "down", "left", "right", and the like are used only to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
Streaming media technology has developed to the point where codec technologies are extremely varied: the most widely used H264; H265, with a higher compression ratio; the cost-effective av1; MPEG4; and so on. On the audio side, there are pcm with good fidelity, the easy-to-use wav, the common mp3, and so on. Each codec algorithm has its own characteristics and is used for specific scenarios. For different audio/video playback devices, the encapsulation formats have in turn spawned many formats, such as avi, mov, mkv, flv, and the most widespread mp4.
Because of this variety of codec technologies, users may face the awkward situation of having difficulty choosing, since a user does not know what video formats the terminal or player in use can play. Usually, developers of terminal software or hardware need to enumerate all cases and normalize them; and even if a user's terminal can recognize and decode these audio/video formats, the capability of its CPU, graphics card, or bandwidth may not allow it to play these media streams/files.
For example, in a private campus network, an administrator wants to view, on a web page, the H264 video stream captured by a camera connected to the network. For security reasons, major browser vendors all prohibit the transmission of Real Time Streaming Protocol (rtsp) data packets on web pages; therefore, because the browser cannot use rtsp, the administrator cannot view the H264 video stream through the web page.
As another example, in the above scenario, the video stream from the camera is an H265 video stream, but the administrator's terminal does not support playback of H265 video (for example, the decoder in the terminal cannot decode H265 video), so the administrator cannot use the terminal to view the H265 video stream.
As another example, many wearable devices are designed around a single-chip microcomputer, without a CPU or GPU, because their industrial design demands lightness and compactness. The decoding power of such wearable devices is therefore too weak to decode camera video, making it difficult to use them to quickly view the camera video of a certain direction.
As another example, in a poor network environment (i.e., a weak network), the terminal is prone to packet loss, which degrades the quality of the audio and video played by the terminal and affects the viewing experience.
The present disclosure provides another information processing method to solve the difficulty of playing video in scenarios such as those in the above examples: the browser cannot use the protocol, the terminal lacks the corresponding decoder, decoding power is weak, or the network is weak.
At least one embodiment of the present disclosure provides an information processing method, another information processing method, an information processing apparatus, another information processing apparatus, an electronic device, and a computer-readable storage medium. The information processing method includes: receiving a first initial application from a first client and responding to the first initial application, wherein the first initial application includes playback parameters of video stream data that the first client supports playing; after responding to the first initial application, receiving a task instruction from the first client; acquiring first video stream data according to the task instruction; converting the first video stream data into second video stream data so that the second video stream data has the playback parameters; and providing the second video stream data to the first client so that the second video stream data is played on the first client. This information processing method can solve the problem of difficult audio and video playback in complex codec environments.
FIG. 1 shows a system architecture 100 applied to an information processing method provided by at least one embodiment of the present disclosure.
As shown in FIG. 1, the system architecture 100 may include a terminal device 101, a server 102, and a communication network 103.
A user may use the terminal device 101 to interact with the server 102 through the communication network 103 to receive or send messages. The communication network 103 is the medium used to provide a communication link between the terminal device 101 and the server 102. The communication network 103 may include various connection types, such as wired or wireless communication links, specifically, for example, WIFI, 3G, 4G, 5G, and fiber-optic cables.
The terminal device 101 may be any of various electronic devices with audio and/or image playback functions, including but not limited to smartphones, tablets, and laptops; the terminal device 101 may also be a single-chip microcomputer, an SoC, a browser, or a custom player, etc. The embodiments of the present disclosure do not limit the product type of the terminal device 101, and the terminal device may be based on various available operating systems, such as Windows, Android, and iOS. Various applications (APPs) may be installed on the terminal device 101, for example, audio/video playback applications, shopping applications, web browser applications, and instant messaging tools, or mini-programs and quick apps may be downloaded and run through application platforms (such as WeChat and Alipay). For example, a user may use an audio/video playback application in the terminal device 101 to play music or video.
The server 102 may be a server that executes the information processing method shown in FIG. 2 below. The server 102 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server, etc. The server 102 may, for example, be built on a server with an operating system such as CentOS, Debian, or FreeBSD, which provides good support for codecs such as ffmpeg; if the server 102 involves a chip, a software toolkit replacing ffmpeg needs to be developed separately.
FIG. 2 shows a flowchart of an information processing method provided by at least one embodiment of the present disclosure.
As shown in FIG. 2, the method may include steps S10 to S50.
Step S10: receiving a first initial application from a first client and responding to the first initial application, the first initial application including playback parameters of video stream data that the first client supports playing.
Step S20: after responding to the first initial application, receiving a task instruction from the first client.
Step S30: acquiring first video stream data according to the task instruction.
Step S40: converting the first video stream data into second video stream data so that the second video stream data has the playback parameters.
Step S50: providing the second video stream data to the first client so that the second video stream data is played on the first client.
In the embodiments of the present disclosure, the information processing method shown in FIG. 2 may be executed by the server 102 in FIG. 1. For example, the server 102 encodes and decodes the audio and video to be played by the first client, thereby using the computing power provided by the powerful software/hardware of the cloud (i.e., the server) to meet the audio and video playback needs of most terminals, implementing a cloud player and solving the problem of difficult audio and video playback in complex codec environments.
For step S10, the first client may be, for example, an audio/video playback application installed on the terminal device 101 in FIG. 1.
For example, the first client sends a first initial application to the server 102, the first initial application including playback parameters of video stream data that the first client supports playing. The playback parameters are, for example, the specific parameter requirements that the first client (for example, the first client's browser player toolkit) supports for (high-definition or real-time) playback. The video stream data is, for example, a data stream that can be read, recognized, and played by an audio/video player.
In some embodiments of the present disclosure, the first initial application further includes an instruction type. When the instruction type is a data instruction, the playback parameters may be encoding parameters, which include, for example, the encapsulation format, codec format, and encoding level. When the instruction type is a player instruction, the playback parameters may include the playback rate, the playback position to which the audio/video jumps, and so on. The instruction type may also include codec instructions. The playback parameters may include, for example, ffmpeg's avoption parameters, which are used to configure the decoding strategy and encoding rules. As a brief explanation: the MP4 encapsulation format usually consists of three boxes — an ftyp box recording the file type, a moov box recording basic playback information such as the frame rate, and an mdat box holding the actual media data. fmp4 can be regarded as MPEG4's support for MP4 live streaming: fmp4 is similar to MP4 but requires no moov box; instead, the basic information is placed in individual moof packets, and a streaming form of media data is formed by the pattern ftyp + moof + mdat + moof + mdat. For example, the playback parameters include a playback time parameter frag_duration indicating the playback time of one group of moof + mdat. The playback parameters include the encoding level and the picture quality level, and may further include the encoding format, etc.
Table 1 below shows a client instruction set provided by at least one embodiment of the present disclosure. The first initialization application and the task instruction may be generated in the format of the client instruction set shown in Table 1.
As shown in Table 1, if the first byte of an instruction is 0, the instruction is a data instruction; if the second byte of such an instruction is 0, the instruction is an initialization application instruction, which includes a JSON-format string containing playback parameters, for example, encapsulation format fmp4, codec format h264, encoding level 42, etc. If the first byte of an instruction is 1, the instruction is a player instruction; if the second byte of such a player instruction is 1, the instruction is a start-playback instruction.
Table 1
(Table 1, the client instruction set, is reproduced as images in the original publication; its byte layout is described in the surrounding text.)
For example, the server 102 in FIG. 1 receives the first initial application sent by the terminal device 101 and responds to the first initial application.
In some embodiments of the present disclosure, step S10 may include: receiving the first initial application from the first client; and, in response to the first initial application, creating a worker thread and opening an input buffer and an output buffer for the worker thread. The input buffer is configured to store the first video stream data. The worker thread is configured to obtain the first video stream data from the input buffer and process it to obtain the second video stream data. The output buffer is configured to receive the second video stream data provided by the worker thread and provide the second video stream data to the first client.
For example, in response to the first initial application, a worker thread is started, and two pieces of heap memory are opened and shared with the worker thread; one serves as the input buffer and the other as the output buffer. For example, responding to the initial application may also include the worker thread initializing the codec (e.g., ffmpeg) working script according to the initialization data and, after creating the output and relay processing units, waiting to read data from the heap memory; thereafter the worker thread continuously tries to read data from this heap memory, and the initialization process is complete.
In some embodiments of the present disclosure, the server includes a controller, which starts the worker thread and opens the two pieces of heap memory shared with the worker thread.
For step S20, for example, after receiving information from the server that the initialization application has been completed, the first client sends a task instruction to the server.
The task instruction may be, for example, an instruction requesting video playback. The task instruction may include the manner of acquiring the first video stream data, the byte length of the first video stream data, and so on.
For example, in the architecture shown in FIG. 1, the server 102 receives the task instruction from the terminal device.
For step S30, the first video stream data is, for example, video or pictures captured by a camera in a campus. For example, a large-screen terminal in the campus monitoring room, acting as the first client, requests playback of the video stream captured by the camera.
The first video stream data may also come from a third party, for example, a video stream of a user's live broadcast.
For step S30, for example, the manner of acquiring the first video stream data is determined according to the task instruction, and the first video stream data is acquired in that manner.
In some embodiments of the present disclosure, determining the acquisition manner and acquiring the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the uniform resource locator (URL) manner, obtaining a URL from the task instruction; acquiring the first video stream data according to the URL; and storing the first video stream data in the input buffer.
In this embodiment, the URL is directly delegated to the codec software development kit (for example, the ffmpeg SDK); a series of toolboxes inside ffmpeg will process the URL. For example, the server accesses the URL to obtain the first video stream data. Acquiring the first video stream data via a URL is simple, direct, and easy to implement.
In other embodiments of the present disclosure, determining the acquisition manner and acquiring the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the buffer manner, extracting the first video stream data from the task instruction; and storing the first video stream data in the input buffer.
This embodiment has a wide scope of application and high flexibility, and is suitable for cases in which the cloud player (i.e., the server) cannot directly access the multimedia source. In such cases, the first client itself encapsulates the audio/video data and sends it to the cloud. For example, if a first client views an H264 video stream captured by a camera but the cloud player cannot access that stream, the first client itself encapsulates the audio/video data to obtain a buffer byte stream and sends the buffer byte stream to the cloud player. In some embodiments of the present disclosure, the audio/video data may be encapsulated directly in the task instruction, so that the cloud player directly reads the buffer byte stream in the task instruction to obtain the first video stream data and stores the first video stream data in the input buffer.
In some embodiments of the present disclosure, after receiving this type of task instruction, the worker thread reads the data into its internal cache and implements the read-in method of CUSTOM_IO by itself. CUSTOM_IO is ffmpeg's input/output processor for user-defined read-in execution.
In still other embodiments of the present disclosure, determining the acquisition manner and acquiring the first video stream data in that manner includes: in response to the task instruction indicating that the acquisition manner is the fragmented byte stream manner, sequentially receiving a plurality of task sub-instructions provided by the first client, the task sub-instructions respectively including different portions of the first video stream data; and sequentially extracting the portions of the first video stream data from the task sub-instructions and storing them in the input buffer in order.
In this embodiment, the first client fragments the first video stream data, i.e., splits it into a plurality of task sub-instructions, and continuously provides the task sub-instructions to the cloud player, each task sub-instruction including a different fragment (i.e., a different portion) of the fragmented first video stream data. The cloud player obtains the fragments of the first video stream data from each task sub-instruction and adds the fragments to the input buffer. This embodiment requires the reserved memory size to be provided at the first initialization; otherwise, the reserved memory space is generated with a default size (for example, 1 GB). Thereafter, the first client continuously writes data into this memory, and the data may carry its own sequence numbers. The transmission of fragmented video stream data has good real-time performance and is more suitable for live broadcasting.
As shown in Table 1, for example, if the second byte of a data instruction is 1, the first video stream data is acquired by direct URL pulling; if the second byte is 2, the client provides a byte stream, i.e., the buffer manner; if the second byte is 3, the client provides partial byte streams in sequence, i.e., the fragmented byte stream manner.
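The fragmented byte stream manner above — sub-instructions carrying sequence-numbered portions that are consumed smallest-sequence-number first — can be sketched as a small reassembler. The gap-free release policy here is an assumption for illustration.

```python
import heapq


class FragmentAssembler:
    """Collect sequence-numbered fragments and release them to the input buffer
    in order, always taking the smallest sequence number first."""

    def __init__(self):
        self.heap = []      # min-heap of (sequence number, fragment bytes)
        self.next_seq = 0   # next sequence number expected by the input buffer

    def add(self, seq: int, data: bytes):
        heapq.heappush(self.heap, (seq, data))

    def drain(self) -> bytes:
        """Pop fragments that continue the sequence without gaps."""
        out = bytearray()
        while self.heap and self.heap[0][0] == self.next_seq:
            _, data = heapq.heappop(self.heap)
            out += data
            self.next_seq += 1
        return bytes(out)
```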
Step S40 may include: the worker thread reading the first video stream data from the input buffer and storing it in the processing queue of the worker thread; and, according to the playback parameters, the worker thread decoding and encoding the first video stream data in the processing queue to obtain the second video stream data.
When a worker thread takes data from the input cache and processes it, the data is placed in another processing queue. This processing queue specifically holds data to be processed until the worker thread releases the read memory lock and pops a read-complete event. The worker thread continuously reads the data with the smallest sequence number from the input cache and, after reading, wipes that data from the input cache to facilitate subsequent writes to the input cache.
In practice, because the speed of reading from the input buffer into the processing queue differs from the write-out speed of the output buffer, a backlog may continuously build up on either the input buffer or the output buffer side. For this reason, read and write locks are added to the worker thread. When the input buffer is full, the worker thread is suspended until the input buffer falls below a certain threshold (the threshold can be configured at initialization). When the output buffer is full, the thread is likewise suspended until the output buffer falls below a certain threshold (also configurable at initialization). While the worker thread is suspended, the controller continuously sends state data to the first client. The state data may include, for example, the state of the worker thread (e.g., read lock locked, read lock released, write lock locked, write lock released, system exception, etc.).
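A minimal sketch of the lock behaviour just described: the lock trips when a buffer is full and releases only once the fill level falls back under a configurable threshold. Modelling a single buffer gate (standing in for either the input or the output buffer) and omitting real thread suspension are simplifying assumptions.

```python
class BufferGate:
    """Buffer with a lock that trips when full and releases below a threshold."""

    def __init__(self, capacity: int, resume_threshold: int):
        self.capacity = capacity
        self.resume_threshold = resume_threshold  # configurable at initialization
        self.level = 0
        self.locked = False

    def put(self, n: int) -> bool:
        """Producer side: refuse writes while locked or when the data won't fit."""
        if self.locked or self.level + n > self.capacity:
            self.locked = True            # buffer full: suspend the producer
            return False
        self.level += n
        return True

    def take(self, n: int) -> int:
        """Consumer side: drain data and release the lock under the threshold."""
        n = min(n, self.level)
        self.level -= n
        if self.locked and self.level <= self.resume_threshold:
            self.locked = False           # back under threshold: resume the producer
        return n
```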
For step S40, for example, the worker thread searches ffmpeg for a suitable encoder according to the codec parameters in the initialization application. In some embodiments of the present disclosure, when the server supports hardware acceleration, any commonly used encoder has a registered hardware-accelerated counterpart in the database. For example, the hardware-accelerated counterpart of the h264 encoder is the nvenc_h264 encoder, and the hardware-accelerated counterpart of the h265 encoder is the hevc_nvenc encoder.
In some embodiments of the present disclosure, for example, the server may store a correspondence table between encoders and hardware-accelerated encoders, look up the hardware-accelerated encoder in the table, and use it to increase encoding and decoding speed.
For example, the user may enter the desired target encoder at the client, e.g., the h264 encoder. Based on the target encoder provided by the client, the server consults the correspondence table to determine whether a hardware-accelerated encoder corresponding to that target encoder exists. If one exists, the hardware-accelerated encoder is used to encode and decode the first video stream data, improving response speed.
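The correspondence-table lookup can be sketched as a plain dictionary. Only the two encoder pairs named in the text are taken from it; the fallback to the software encoder when no hardware entry exists is an assumption.

```python
# Assumed correspondence table; only the two pairs below come from the text.
HW_MAP = {"h264": "nvenc_h264", "h265": "hevc_nvenc"}

def pick_encoder(target: str, hw_map: dict) -> str:
    """Return the hardware-accelerated encoder for `target` if the
    correspondence table has one, otherwise fall back to `target` itself."""
    return hw_map.get(target, target)
```

With this shape, adding support for another codec is a one-line table entry rather than a code change.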
In other embodiments of the present disclosure, if the user does not enter a specified target encoder at the client, a default encoder is provided to the server, and the server consults the correspondence table to determine whether a hardware-accelerated encoder corresponding to the default encoder exists.
In some embodiments of the present disclosure, decoding and encoding, by the worker thread according to the playback parameters, the first video stream data in the processing queue to obtain the second video stream data includes: loading, by the worker thread, the first video stream data in the processing queue into a codec; and decoding and encoding, by the codec, the first video stream data to obtain the second video stream data.
For example, in some embodiments of the present disclosure, the codec may be an ffmpeg codec. For example, the server includes the ffmpeg toolkit, with multiple codec features enabled, such as hardware acceleration, h264 coding/decoding, and h265 coding/decoding. ffmpeg may also include lib packages for instruction-set acceleration and multithreading.
For example, the worker thread may use the ffmpeg toolkit to decode and encode the first video stream data in the processing queue.
In some embodiments of the present disclosure, the information processing method may further include: in response to the first video stream data encountering an exception before entering the codec, sending, by the worker thread, a rollback event to the controller; and, in response to the rollback event, returning, by the controller, the first video stream data to the input buffer.
If, for whatever reason (such as a memory overflow), the first video stream data raises an error before entering ffmpeg, the worker thread throws a rollback event; the controller returns that data from the processing queue to the input buffer, and the worker thread moves its read cursor back to the previous position, resetting the state.
In some embodiments of the present disclosure, ffmpeg maintains an internal frame sequence, which makes the data irreversible; therefore only data that has not yet entered the ffmpeg reader can be rolled back, while data coming out of the decoder, after encoding (or, if the user set no encoder, sent as raw YUV frames / PCM samples), can only enter the output buffer. For this reason, whenever the worker thread reads data from the input buffer, it marks the data with a cursor position.
Providing the rollback event allows the first video stream data to be returned to the input buffer so that correct first video stream data can be read from the input buffer again, ensuring the correctness of the first video stream data and improving codec correctness.
In some embodiments of the present disclosure, the information processing method may further include: in response to an internal processing exception of the codec, requesting, by the worker thread, the controller to mark the worker thread as a zombie thread.
For example, if an error occurs inside the ffmpeg decoder/encoder for whatever reason (such as scrambled packet order), the worker thread throws a self-destruct event asking the controller to mark it as a zombie thread.
When the controller marks a thread as a zombie thread, the controller kills the thread and then clears all data related to it from the input buffer, the processing queue, and the output buffer. After notifying the terminal of the destroy event, it removes the key-value pair, disconnects, and destroys the socket connection handle.
In some embodiments of the present disclosure, the information processing method may further include: in response to an exception occurring after the codec processes the first video stream data, sending, by the worker thread, a packet-loss event to the controller, and sending, by the controller, a packet-loss reminder to the first client.
For example, if an error occurs after leaving the ffmpeg encoder for whatever reason (such as a system exception), the worker thread throws a packet-loss event and asks the controller to notify the terminal of the event. This embodiment promptly notifies the first client of the packet loss, so that the first client can handle the packet-loss event in time, ensuring the reachability and correctness of the first video stream data.
FIG. 3A shows a flowchart of a method for S50 in FIG. 2 provided by at least one embodiment of the present disclosure.
As shown in FIG. 3A, step S50 may include steps S51 to S54.
Step S51: generating encoding information according to the encoding operation performed by the worker thread on the first video stream data.
Step S52: storing the encoding information at the head of the output buffer's queue, so that the first data packet output by the output buffer contains the encoding information.
Step S53: writing the second video stream data into the output buffer.
Step S54: sequentially providing a plurality of data packets from the output buffer to the first client, the data packets including the encoding information and the second video stream data.
For step S51, after demultiplexing (demux) the first video stream data, the worker thread obtains encoding information, for example the profile, the level (maximum resolution), the sequence parameter set (SPS), the picture parameter set (PPS), and constraint_set_flag (SPS/PPS information).
For step S52, the encoding information is kept at the head of the output buffer's queue and must not be erased before the worker thread is destroyed. Any client requesting access to the second video stream data thus receives this encoding information, which lets the client prepare or optimize local decoder initialization. Therefore, in this embodiment, the first data packet of the message that the cloud player returns to the first client or any other client contains the encoding information; in the present disclosure the packet carrying this encoding information is called the codec packet.
For step S53, for example, the worker thread continuously encodes and decodes the fragmented first video stream data and continuously writes the resulting second video stream data into the output buffer.
In some embodiments of the present disclosure, step S53 may include: encapsulating the playback basis information of key frames in the second video stream data as obfuscated packets, so that the first client parses the obfuscated packets to obtain the playback basis information, the obfuscated packets including a plurality of randomly generated bytes.
Encapsulating the playback basis information of key frames in the second video stream data as obfuscated packets improves downlink data security: no third party can decode the video stream without requesting the cloud. The obfuscated packets of the present disclosure use a mixed mode: for packets containing a key frame, the playback basis information is encapsulated as an obfuscated packet (wild card, WC); otherwise the packet is sent normally. The WC packet contains the necessary data information and requires the client's cooperation to generate the I-frame. Because of the special role of key frames, the other reference frames all show screen corruption if the key frame is missing.
In some embodiments of the present disclosure, since the obfuscated packet requires the client's cooperation to parse out the playback basis information, encapsulating the playback basis information of key frames in the second video stream data as obfuscated packets can exist as a pluggable plug-in, i.e., it is configurable.
The wild card box is a kind of obfuscated packet. Unlike other mp4 boxes, its first 4 bytes are the string marker wc; the next 8 bytes are the token, namely the 8-byte unsigned int token obtained when the socket was established; the following 8 bytes are a random-length value, immediately followed by that many bytes, which hold many randomly generated bytes used for obfuscation. After that come the real moof sub-boxes, such as mfhd. The client, however, must construct the moof, track fragment (traf), and other container boxes itself.
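A byte-level sketch of packing and unpacking a wild card box per the layout just described (4-byte 'wc' marker, 8-byte token, 8-byte random-length field, random filler, then the real payload). The zero padding of the two-character marker and the little-endian byte order are assumptions the text does not fix:

```python
import os
import struct

def pack_wild_card(token: int, payload: bytes, junk_len: int = 16) -> bytes:
    """Serialize a wild-card (WC) box: 'wc' marker, 8-byte token,
    8-byte filler length, `junk_len` random bytes, then the payload
    (the real moof sub-boxes)."""
    junk = os.urandom(junk_len)
    return (b"wc\x00\x00"                      # 4-byte marker, zero-padded
            + struct.pack("<Q", token)         # 8-byte unsigned token
            + struct.pack("<Q", junk_len)      # 8-byte filler length
            + junk                             # random obfuscation bytes
            + payload)

def unpack_wild_card(box: bytes, token: int) -> bytes:
    """Strip the WC wrapper, rejecting a box whose token does not match."""
    assert box[:2] == b"wc"
    assert struct.unpack("<Q", box[4:12])[0] == token
    junk_len = struct.unpack("<Q", box[12:20])[0]
    return box[20 + junk_len:]
```

Because the filler length is random per packet, two WC boxes carrying the same payload differ byte-for-byte, which is the obfuscation effect the text describes.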
In other embodiments of the present disclosure, for example with fmp4, the basic information is placed in individual moof packets, and streaming media data is formed in the pattern ftyp+moof+mdat+moof+mdat. The playback basis information of key frames in the second video stream data can therefore be encapsulated as moof packets.
For step S54, for example, a plurality of data packets is provided from the output buffer to the first client in sequence, the data packets including the codec packet and the data packets corresponding to the second video stream data.
In some embodiments of the present disclosure, the worker thread includes a read lock, which is triggered when the input buffer is full; in response to the read lock being triggered, the worker thread no longer reads the first video stream data from the input buffer.
In some embodiments of the present disclosure, the worker thread further includes a write lock, which is triggered when the output buffer is full; in response to the write lock being triggered, the worker thread no longer writes the second video stream data into the output buffer and no longer reads the first video stream data from the input buffer.
For example, the read lock is triggered when the input buffer is full; no further first video stream data is read in, and the client receives the read-lock information provided by the server. The write lock is triggered when the output buffer is full; the codec in the worker thread is suspended, no longer writes out, and also no longer reads first video stream data into the input buffer.
Providing the read lock and the write lock not only isolates the two behaviors but also ensures that the video stream arriving at the cloud is encoded/decoded and sent as fast as possible.
In some embodiments of the present disclosure, the information processing method further includes: obtaining the read speed Vr at which the input buffer reads in the first video stream data; obtaining the transcoding speed Vd at which the worker thread converts the first video stream data into the second video stream data; obtaining the write speed Vw at which the output memory outputs the second video stream data; determining whether the transcoding speed Vd is greater than the write speed Vw and the write speed Vw is greater than the read speed Vr; and, in response to the read speed Vr, the transcoding speed Vd, and the write speed Vw not satisfying that Vd is greater than Vw and Vw is greater than Vr, adjusting the read speed Vr, the transcoding speed Vd, and the write speed Vw.
In some embodiments of the present disclosure, mainly for live-streaming scenarios, the lock determination for fragmented byte streams is: Vr = IO Read (Websocket) KBPS / Input Stream BitRate; Vw = IO Write (Websocket) KBPS / Output Stream BitRate; and Vd = Decoder (Worker) Read KBPS / Input Stream BitRate. Here IO Read (Websocket) KBPS denotes the websocket read-in speed, Input Stream BitRate denotes the bitrate in the input stream's header, IO Write (Websocket) KBPS denotes the websocket write-out speed, Output Stream BitRate denotes the bitrate in the output stream's header, and Decoder (Worker) Read KBPS denotes the decoding speed.
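The three normalized speeds and the desired ordering Vd > Vw > Vr follow directly from those formulas; a sketch (all rates in KBPS, function and parameter names illustrative):

```python
def transcode_speeds(io_read_kbps: float, io_write_kbps: float,
                     decoder_read_kbps: float,
                     in_bitrate_kbps: float, out_bitrate_kbps: float):
    """Compute Vr, Vw, Vd per the disclosure's formulas and report whether
    the healthy ordering Vd > Vw > Vr holds."""
    vr = io_read_kbps / in_bitrate_kbps        # websocket read vs input bitrate
    vw = io_write_kbps / out_bitrate_kbps      # websocket write vs output bitrate
    vd = decoder_read_kbps / in_bitrate_kbps   # decoder read vs input bitrate
    healthy = vd > vw > vr
    return vr, vw, vd, healthy
```

A `healthy` result of `False` is the condition under which the lock events and rate adjustment described below come into play.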
Table 2 below shows the correspondence among the read speed Vr, the transcoding speed Vd, the write speed Vw, and lock events, provided by at least one embodiment of the present disclosure.
Table 2
[Table 2 is provided as images in the original publication (Figures PCTCN2022120543-appb-000003 and PCTCN2022120543-appb-000004).]
As shown in Table 2, if Vr = 3, Vd = 1, and Vw = 2, the write speed is greater than the read speed, neither the read lock nor the write lock is triggered, and the transcoding speed does not need to be lowered. During read-in, write-out, and decoding, the worker thread counts the bytes flowing in and out; every few hundred milliseconds it decides, according to the table, whether to engage or release the read/write locks and then restarts the count.
In embodiments of the present disclosure, the ordering of the read, transcoding, and write speeds is Vd > Vw > Vr, which ensures that the video stream arriving at the cloud player is encoded/decoded and sent as fast as possible. Otherwise data piles up on one side and lock events occur. If a read lock occurs, the write-out process or the decoding process is relatively slow; if the main cause is the decoder, the decoder can mitigate or solve the problem by dynamically modifying the worker thread's codec rate (i.e., bitrate).
In some embodiments of the present disclosure, the transcoding speed is divided into a plurality of encoding layers with successively decreasing transcoding rates. In response to the read speed, the transcoding speed, and the write speed not satisfying that the transcoding speed is greater than the write speed and the write speed is greater than the read speed, adjusting the read speed, the transcoding speed, and the write speed includes: in response to the transcoding speed being less than the write speed, adjusting the transcoding bitrate, starting from the next key frame of the first video stream data, to the transcoding bitrate of the encoding layer below the current one. For example, if the transcoding speed is still below the write speed after lowering the bitrate by one layer, the bitrate is lowered by another layer; that is, in this embodiment the transcoding bitrate can be reduced layer by layer, raising the transcoding speed for smoother playback.
When the write speed falls below the read speed, a read-lock exception event is returned and no new packets of the first video stream (the input stream) are accepted. In this case, on receiving the read-lock exception event, the first client should locally cache or discard the packets it was about to send.
In some embodiments of the present disclosure, dynamic modification of the worker thread's bitrate can be enabled via the initialization parameter configuration. For example, at initialization, N encoding layers are prepared downward from the current input-stream bitrate as the origin, with each layer's bitrate = original bitrate × 4^(-N). When the input buffer is full, the read-lock counter (RC) is incremented by 1; when RC exceeds a threshold, the bitrate is adjusted down one layer starting from the next key frame and RC is reset, that is, the system switches to the next layer and regenerates the output stream.
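The layer formula (each layer's bitrate = original bitrate × 4^-N) reproduces the ladder shown in FIG. 3B; a one-line sketch:

```python
def bitrate_ladder(base_kbps: int, layers: int) -> list:
    """Layer n bitrate = base * 4**(-n), rounded down to whole kbps."""
    return [base_kbps // (4 ** n) for n in range(layers)]
```

Starting from 1024 kbps with four layers gives exactly the 1024/256/64/16 kbps ladder of FIG. 3B.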
FIG. 3B shows an example diagram of a plurality of encoding layers provided by at least one embodiment of the present disclosure.
As shown in FIG. 3B, for example, four encoding layers can be prepared when the worker thread is initialized, with bitrates of 1024 kbps, 256 kbps, 64 kbps, and 16 kbps, respectively.
For example, in FIG. 3B, the bitrate at which the worker thread encodes the second video stream information into YUV420 format can be chosen from 1024 kbps, 256 kbps, 64 kbps, and 16 kbps. When the input buffer is getting full, the bitrate can be adjusted from 1024 kbps down to 256 kbps.
FIG. 4 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure.
As shown in FIG. 4, besides steps S10 to S50 shown in FIG. 2, the information processing method may further include steps S60 and S70, which may be performed, for example, after step S50.
Step S60: obtaining a player instruction of the first client.
Step S70: controlling, according to the player instruction, the playback state of the second video stream data at the first client.
This method lets the user control the played second video stream data through the first client, for example controlling its playback speed, pausing playback, or starting playback, thereby further completing the functionality of the cloud player.
For step S60, the player instruction includes, for example, at least one of the following: a pause instruction, a start-playback instruction, a speed-change instruction, a reset-to-initial-state instruction, and a seek instruction.
The pause instruction pauses playback of the second video stream data; the start-playback instruction starts its playback; the speed-change instruction adjusts its playback speed; the reset-to-initial-state instruction returns it to the initial moment; and the seek instruction makes the second video stream data jump to a given playback position and play from that position.
In embodiments of the present disclosure, a player instruction comprises, for example, multiple bytes, with a first byte of 1 indicating a player instruction. When the first byte is 1, the second byte specifies the concrete playback instruction: for example, a second byte of 0 means reset to the initial state, 1 means start playback, 2 means pause, and so on. Those skilled in the art can define for themselves which player instruction each second-byte value represents.
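That byte layout can be parsed as below. Only the example second-byte values from the text (0 reset, 1 play, 2 pause) are encoded, and, as the text notes, implementers may choose their own mapping:

```python
def parse_player_instruction(data: bytes):
    """Return the action named by a player instruction, or None if the
    first byte does not mark a player instruction."""
    if len(data) < 2 or data[0] != 1:
        return None                      # not a player instruction
    actions = {0: "reset", 1: "play", 2: "pause"}  # example mapping only
    return actions.get(data[1], "unknown")
```

Unknown second-byte values fall through to "unknown" rather than raising, so the mapping can be extended without breaking the parser.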
For step S70, for example, the server responds to the player instruction and controls states such as the playback speed, pausing, and starting of the second video stream data at the first client.
FIG. 5 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure.
As shown in FIG. 5, besides steps S10 to S50 shown in FIG. 2, the information processing method may further include steps S80 and S90. Step S80 may be performed, for example, before step S10.
Step S80: before receiving the first initialization application of the first client, establishing a bidirectional data channel with the first client.
Step S90: after responding to the first initialization application, establishing the correspondence among the bidirectional data channel, the output buffer, the input buffer, the verification information, and the worker thread, so as to interact with the first client according to the correspondence.
For step S80, the bidirectional data channel may be, for example, a Websocket-based data channel; the first client and the server communicate bidirectionally over Websocket.
In some embodiments of the present disclosure, considering the uncertainty of network delivery, and since websocket runs over TCP, whose ordering can get scrambled under high concurrency, embodiments of the present disclosure add an ACK mechanism similar to the TCP handshake. That is, each time an instruction is sent, a unique PING value is generated from a microsecond timestamp (by default no more than 1,000,000 concurrent users are assumed; if this overflows, the PING value can be generated by other means such as the snowflake algorithm). On receipt, the controller returns, on the original instruction, a PONG carrying the same value as the acknowledgment. The terminal can confirm delivery from the reception status of the PONG and may resend the packet if no PONG arrives within a certain time. The cloud then checks whether its current pointer has reached that packet: if the packet has not yet been reached, it is overwritten; if it is being read into the worker thread or subsequent packets have already been read, the data is discarded.
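A PING value generated from a microsecond timestamp, as described, can be as simple as the following sketch; the snowflake fallback for overflow is omitted and the function name is illustrative:

```python
import time

def make_ping_id() -> int:
    """Unique-enough PING value: current time in microseconds.
    (The disclosure itself assumes at most ~1,000,000 concurrent users;
    a snowflake-style generator would replace this on overflow.)"""
    return time.time_ns() // 1_000
```

The controller echoes this exact value back as the PONG, which is what lets the terminal match acknowledgments to the instructions it sent.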
For step S90, for example, when a client connects, the controller obtains the connection handle; when the client sends the initialization application, the controller initializes memory, opens the input buffer and the output buffer, creates the worker thread, generates the verification-information token, and, within the worker thread, builds the correspondence among the bidirectional data channel, the output buffer, the input buffer, the verification information, and the worker thread; this correspondence is stored, for example, as key-value pairs.
In some embodiments of the present disclosure, the information processing method may further include: monitoring the read lock and the write lock of the worker thread through the controller to obtain read-lock events and write-lock events; and providing the read-lock events and the write-lock events to the first client through the controller.
For example, the controller also listens to the lock events of the various threads, passes notifications to the terminals according to the events, and, depending on the lock situation, temporarily suspends incoming/outgoing packets so that the whole system runs healthily. In addition, the controller periodically sends messages to the worker threads and, in response to receiving no reply to a message from a worker thread within a preset period, clears that worker thread. For example, a thread without a receipt within the preset period (for example, one minute) is marked as a zombie thread; for the health of the whole system, the controller safely removes zombie threads, notifies the related socket handles, and releases resources.
FIG. 6 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure.
As shown in FIG. 6, besides steps S10 to S50 shown in FIG. 2, the information processing method may further include steps S601 and S602, which may be performed, for example, after step S50.
Step S601: in response to receiving a second initialization application provided by a second client, obtaining information to be verified from the second initialization application.
Step S602: in response to the information to be verified being consistent with the verification information, obtaining, according to the correspondence, the output buffer corresponding to the verification information, and providing the second video stream data from the output buffer to the second client.
For step S601, as shown in FIG. 1, the system architecture 100 may include, besides the first client 101, a client 104, which is an example of the second client.
For example, after the client 101 sends the first initialization application to the server for the first time, the first message body returned by the server contains a token. The client 101 can share the token with other users (for example, the client 104), so that the client 104 generates a second initialization application from the token and sends that second initialization application to the server.
For example, the token can be shared with other users, who can then access the video stream sent by the server through the token, which amounts to multicasting the video; if too many tokens request access, only up to the maximum multicast number of connections is retained and service is refused to the other token holders. The task initiator has full authority over the task.
In response to receiving the second initialization application provided by the client 104, the server obtains the information to be verified from the second initialization application.
For step S602, the server checks the information to be verified; in response to the information to be verified being consistent with the verification information, it obtains, according to the correspondence, the output buffer corresponding to the verification information and provides the second video stream data from the output buffer to the second client.
Table 3 below shows a data set of a server provided by at least one embodiment of the present disclosure. The data that the server provides to clients can be generated in the format of Table 3. For example, in the example of Table 3, if the first byte of the data returned by the server to the client is 1, the data is PONG data; if the 1-byte status bit in the PONG data is 01, the first video stream data has entered the input buffer. As another example, if the first byte of the returned data is 0, the data is a data stream; if the second byte of the data stream is 1, the data stream is an event stream; and if the 1-byte third-level classification of the event stream is 02, the cloud player has been paused, while E0 indicates that the write lock has been triggered.
Table 3
[Table 3 is provided as images in the original publication (Figures PCTCN2022120543-appb-000005 and PCTCN2022120543-appb-000006).]
FIG. 7 shows a flowchart of another information processing method provided by at least one embodiment of the present disclosure.
This information processing method is performed, for example, by the client 101 or the client 104 in FIG. 1.
As shown in FIG. 7, the information processing method may include steps S710 to S730.
Step S710: sending an initialization application to a server, the initialization application including playback parameters of video stream data whose playback is supported.
Step S720: after the server responds to the initialization application, sending a task instruction to the server.
Step S730: receiving second video stream data provided by the server and playing the second video stream data, the second video stream data being obtained by the server by converting first video stream data acquired according to the task instruction.
In this information processing method, the server encodes and decodes the audio/video to be played by the first client, using the powerful software/hardware computing power of the cloud (i.e., the server) to meet the audio/video playback needs of most terminals. This realizes a cloud player and solves the difficulty of audio/video playback in complex codec environments.
For step S710, for example, the client 101 sends an initialization application to the server. The initialization application may be, for example, the first initialization application described above and may be generated, for example, according to the instruction set of Table 1 above.
As another example, the client 104, after receiving the token shared by the client 101, may also send an initialization application to the server.
For step S720, for example, the task instruction is sent to the server after the server's response is received.
The task instruction may be, for example, a request for the server to provide the second video stream data. Step S720 may be, for example, the client 101 or the client 104 pulling the stream.
In response to receiving the task instruction, the server performs, for example, steps S30 to S50 above to convert the first video stream data acquired according to the task instruction into the second video stream data.
For step S730, for example, the client 101 receives the second video stream data and plays the second video stream data.
In some embodiments of the present disclosure, step S730 may include: receiving the second video stream data provided by the server; parsing the second video stream data to obtain the playback basis information in the second video stream data; and playing the second video stream data according to the playback basis information.
For example, the second video stream data includes obfuscated packets. The first 4 bytes of an obfuscated packet are the string marker wc; the next 8 bytes are the token, namely the 8-byte unsigned int token obtained when the socket was established; the following 8 bytes are a random-length value, immediately followed by that many randomly generated bytes used for obfuscation. After that come the real moof sub-boxes such as mfhd, which improves security.
The client must construct the moof, track fragment, and other container boxes (traf) itself. For example, let the bytes after mfhd be Buf, and let the length of Buf, i.e., the byte count of the traf, be BufLen. Convert the utf8 string "mfhd" into a byte array, append the bytes of BufLen generated in little-endian order, then append Buf; denote this whole byte array TrafBuf and its length TrafBufLen. The traf construction is then complete.
Denote the mfhd bytes MfhdBuf and their length MfhdBufLen. Now construct the moof box: let MoofBufLen = MfhdBufLen + TrafBufLen; convert the utf string "moof" into a byte array, append the little-endian bytes of MoofBufLen, append MfhdBuf, and then append TrafBuf, which yields the complete moof packet.
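A literal sketch of the concatenation recipe above. Note two hedges: the disclosure writes box payload lengths in little-endian after the type string, unlike standard ISO-BMFF boxes (big-endian length before the type), and it does not state the width of the length field, so the 4-byte width here is an assumption; this follows the disclosure as written rather than the ISO format.

```python
import struct

def build_box(box_type: bytes, payload: bytes) -> bytes:
    """Per the text's recipe: type string, then payload length as
    little-endian bytes, then the payload itself."""
    return box_type + struct.pack("<I", len(payload)) + payload

def build_moof(mfhd_payload: bytes, traf_payload: bytes) -> bytes:
    """Assemble TrafBuf and MfhdBuf, then wrap both in a moof box whose
    length is MfhdBufLen + TrafBufLen, as the text describes."""
    traf = build_box(b"traf", traf_payload)
    mfhd = build_box(b"mfhd", mfhd_payload)
    return build_box(b"moof", mfhd + traf)
```

The moof payload length equals the combined mfhd and traf lengths, matching MoofBufLen = MfhdBufLen + TrafBufLen in the text.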
For example, simply follow the MSE-standard API: initialize a MediaSource and create a SourceBuffer according to the codec packet. Thereafter, continuously feeding moof and mdat into the SourceBuffer yields the second video stream data.
FIG. 8 shows a schematic diagram of a system architecture 800 applying the information processing method provided by at least one embodiment of the present disclosure.
As shown in FIG. 8, the system architecture 800 may include a video playback tag 801, a media source buffer 802, a browser player toolkit 803, a bidirectional channel connection 804, a controller 805, an input buffer 806, a plurality of worker threads 807 to 837, and an output buffer 808. The video playback tag 801, the media source buffer 802, and the browser player toolkit 803 are on the client side; the controller 805, the input buffer 806, the worker threads 807 to 837, and the output buffer 808 are on the cloud-player side.
The example of FIG. 8 shows a plurality of worker threads 807 to 837, which are, for example, created for different clients respectively. The worker thread 807 is taken as an example below to describe the implementation.
The client and the cloud player communicate through the bidirectional channel connection 804, which may be, for example, a websocket.
The browser player toolkit 803 is, for example, a Browser Player SDK configured to collect the first video stream data and send it to the cloud player over websocket in a prescribed format (for example, transport stream TS, program stream PS, and so on).
In response to receiving the first video stream data, the controller 805 of the cloud player sends the first video stream data to the input buffer 806 created in advance.
The controller 805 links everything together. When a socket client connects, the controller 805 obtains the connection handle; when the socket client sends an initialization application, the controller 805 initializes memory and creates a worker thread, generates a token, and forms key-value pairs among these three in the main thread. It also listens to the lock events of the various threads, passes notifications to the terminals according to the events, and, depending on the lock situation, temporarily suspends incoming/outgoing packets so that the whole system runs healthily. In addition, the controller 805 periodically sends messages to the worker threads; a thread without a receipt within a preset period, for example, is marked as a zombie thread, which the controller safely removes for the health of the whole system, notifying the related socket handles and releasing resources. In short, the controller 805 acts as a steward or overseer.
The worker thread 807 obtains the first video stream data from the input buffer 806 and encodes/decodes the first video stream data to obtain the second video stream data. The worker thread 807 sends the second video stream data to the output buffer 808 in sequence.
In some embodiments of the present disclosure, the worker thread 807 includes, for example, the ffmpeg toolkit. After the user's initialization, the ffmpeg data initialization completes and waits for an event indication. The worker thread 807 then waits for the actual task instruction; in response to the task instruction, the controller writes the task instruction (for example 00 01 3E 72 00 74 00 73 00 70 00, where 00 denotes a data instruction, 01 denotes the URL mode, the URL length is 62, and the remainder is rtsp://…) into the read memory and sends an event to notify the worker thread 807. After receiving the event, the worker opens the input stream and, according to the initialization parameters, creates the target output stream, attaching the custom IO context (pb) to the output stream, thereby creating a pipeline of outgoing data. After the pipeline is sliced by the fmp4 slicer, individual data packets that can be sent to the client are formed. When the write lock is not engaged, the packets are pushed into the output buffer 808 for the websocket service to handle. Note that packaging as fmp4 is only an example; those skilled in the art can package into whatever format is actually needed. Packaging as fmp4 makes it possible to use the browser's Media Source Extension (MSE) protocol, which conveniently allows playback of audio/video with the browser's bundled trimmed-down ffmpeg.
The output buffer 808 sends the data packets of the second video stream data to the controller 805, and the controller 805 provides the second video stream data to the client via websocket.
For example, the controller 805 provides the second video stream data to the client via websocket.
The browser player toolkit 803 of the client parses the received data packets, restores them to fmp4, and provides the fmp4 to the Media Source Buffer 802, which provides the fmp4 to the video playback tag 801 so that the second video stream data is played through the video playback tag 801. The video playback tag 801 is, for example, an H5 <video> tag.
Therefore, in the example of FIG. 8, the pattern combining the websocket, the controller, and the worker threads is called the SCW pattern for short in the present disclosure. The SCW pattern is a custom pattern of websocket application-layer communication, used for bidirectional streaming-message notification, thread scheduling and management, and the like.
FIG. 9 shows a schematic diagram of a server architecture 900 applying the information processing method provided by at least one embodiment of the present disclosure.
As shown in FIG. 9, the server architecture includes a controller 901, an input buffer 902, a worker thread 903, and an output buffer 904.
For example, after a fragmented byte stream is read in by the controller 901, the controller checks its protocol, token, and so on, and provides the fragmented byte streams that pass the checks to the input buffer 902.
The internal cache of the worker thread 903 continuously reads byte streams from the input buffer 902 into the processing queue of the internal cache. For example, the processing queue of the internal cache contains byte stream 4, byte stream 5, and byte stream 6.
In some embodiments of the present disclosure, the internal cache may be a ring buffer provided with two pointers: an initial pointer marking the start of the ring memory, and a current pointer marking the position of the byte stream currently to be read. If the cache space between the initial pointer and the current pointer can hold a byte stream read from the input buffer, that byte stream is stored in this cache space.
The partial cache of the worker thread 903 reads byte streams from the processing queue of the internal cache for codec processing. For example, according to the playback parameters, the read lock, and the state, the avio callback of AVFormatCtx reads data from the partial cache, obtaining an AVInputStream object and its corresponding parameters; thereafter the ffmpeg toolkit continuously reads the AVInputStream stream data and hands it over to the ffmpeg toolkit for decoding.
For example, an AVPacket Reader object in the ffmpeg toolkit reads the input stream, then SwsScale/AVFilter objects filter the input stream or perform image scaling and similar processing, while the ffmpeg toolkit uses Halt/Resume to handle pause, resume, and similar transactions and to control operations such as codec pending.
The output stream produced by the ffmpeg toolkit's codec passes through an AVFormatCtx object to form the output stream AVOutputStream; after the PbWriter object regularizes the AVOutputStream and the fmp4wrapper object converts it to fmp4 format, the fmp4-format video stream is written to the output buffer 904. The ffmpeg toolkit cyclically processes each byte stream with the AVPacket Reader, SwsScale/AVFilter, Halt/Resume, pending, and similar objects.
As shown in FIG. 9, the first bytes of the output buffer 904 form the above-described codec packet containing the encoding information; see the description above for details. After the codec packet, the bytes in the output buffer 904 are fmp4 data packets, which may include, for example, an Ftype packet and a Moov packet. Ftype records the file type; Moov records playback basis information such as the frame rate.
The output buffer 904 writes the fmp4-format video stream to the controller, and the controller provides the fmp4 to the client.
In embodiments of the present disclosure, because the channel is a bidirectional data channel, security is divided into three parts: handshake security, uplink security, and downlink security. The handshake part is the symmetric or asymmetric encryption of WSS. Downlink security refers to the security of data sent from the cloud to the client, specifically the non-reusability of the data: no third party can decode the video stream without requesting the cloud. To avoid hurting efficiency, embodiments of the present disclosure use a mixed mode: for packets containing a key frame, the moof packet is encapsulated as an obfuscated packet (wild card); otherwise the packet is sent normally. The obfuscated packet contains the necessary data information but in a different format, and requires the terminal's cooperation to generate the key frame (see the appendix for the format). Because of the special role of key frames, the other reference frames all show screen corruption if the key frame is missing. Since this part depends on the client, it exists as a pluggable plug-in, i.e., it is configurable.
Uplink security refers to the security of data uploaded from the client to the cloud, specifically the non-operability of the data: no third party can access a user's data without requesting the token from the terminal.
In the example of FIG. 9, the controller 901 is also used for thread management of the worker thread 903.
FIG. 10 shows a schematic block diagram of an information processing apparatus 1000 provided by at least one embodiment of the present disclosure.
For example, as shown in FIG. 10, the information processing apparatus 1000 includes a first receiving unit 1010, a second receiving unit 1020, an instruction obtaining unit 1030, a conversion unit 1040, and a providing unit 1050.
The first receiving unit 1010 is configured to receive a first initialization application of a first client and respond to the first initialization application, the first initialization application including playback parameters of video stream data whose playback the first client supports.
The first receiving unit 1010 may perform, for example, step S10 described in FIG. 2.
The second receiving unit 1020 is configured to receive a task instruction of the first client after responding to the first initialization application.
The second receiving unit 1020 may perform, for example, step S20 described in FIG. 2.
The instruction obtaining unit 1030 is configured to acquire first video stream data according to the task instruction.
The instruction obtaining unit 1030 may perform, for example, step S30 described in FIG. 2.
The conversion unit 1040 is configured to convert the first video stream data into second video stream data so that the second video stream data has the playback parameters.
The conversion unit 1040 may perform, for example, step S40 described in FIG. 2.
The providing unit 1050 is configured to provide the second video stream data to the first client so that the second video stream data is played at the first client.
The providing unit 1050 may perform, for example, step S50 described in FIG. 2.
FIG. 11 shows a schematic block diagram of another information processing apparatus 1100 provided by at least one embodiment of the present disclosure.
For example, as shown in FIG. 11, the information processing apparatus 1100 includes an application sending unit 1110, an instruction sending unit 1120, and a playback unit 1130.
The application sending unit 1110 is configured to send an initialization application to a server, the initialization application including playback parameters of video stream data whose playback is supported.
The application sending unit 1110 may perform, for example, step S710 described in FIG. 7.
The instruction sending unit 1120 is configured to send a task instruction to the server after the server responds to the initialization application.
The instruction sending unit 1120 may perform, for example, step S720 described in FIG. 7.
The playback unit 1130 is configured to receive second video stream data provided by the server and play the second video stream data, the second video stream data being obtained by the server by converting first video stream data acquired according to the task instruction.
The playback unit 1130 may perform, for example, step S730 described in FIG. 7.
For example, the first receiving unit 1010, the second receiving unit 1020, the instruction obtaining unit 1030, the conversion unit 1040, the providing unit 1050, the application sending unit 1110, the instruction sending unit 1120, and the playback unit 1130 may be hardware, software, firmware, or any feasible combination thereof. For example, they may be dedicated or general-purpose circuits, chips, or devices, or a combination of a processor and a memory. The embodiments of the present disclosure place no limitation on the specific implementation forms of the above units.
It should be noted that, in embodiments of the present disclosure, the units of the information processing apparatus 1000 or the information processing apparatus 1100 correspond to the steps of the aforementioned information processing methods; for the specific functions of the apparatus 1000 or the apparatus 1100, refer to the related descriptions of the information processing methods, which are not repeated here. The components and structures of the information processing apparatus 1000 shown in FIG. 10 and the information processing apparatus 1100 shown in FIG. 11 are merely exemplary and not limiting; as needed, the apparatus 1000 or the apparatus 1100 may further include other components and structures.
At least one embodiment of the present disclosure further provides an electronic device including a processor and a memory, the memory including one or more computer program modules. The one or more computer program modules are stored in the memory and configured to be executed by the processor, and include instructions for implementing the information processing method described above. With this electronic device, the server can encode and decode the audio/video to be played by the first client, using the powerful software/hardware computing power of the cloud (i.e., the server) to meet the audio/video playback needs of most terminals, realizing a cloud player and solving the difficulty of audio/video playback in complex codec environments.
FIG. 12A is a schematic block diagram of an electronic device provided by some embodiments of the present disclosure. As shown in FIG. 12A, the electronic device 1200 includes a processor 1210 and a memory 1220. The memory 1220 stores non-transitory computer-readable instructions (for example, one or more computer program modules). The processor 1210 is used to run the non-transitory computer-readable instructions, which, when run by the processor 1210, can perform one or more steps of the information processing method described above. The memory 1220 and the processor 1210 may be interconnected by a bus system and/or another form of connection mechanism (not shown).
For example, the processor 1210 may be a central processing unit (CPU), a graphics processing unit (GPU), or another form of processing unit with data processing capability and/or program execution capability; for example, the CPU may be of the X86 or ARM architecture. The processor 1210 may be a general-purpose or dedicated processor and may control other components in the electronic device 1200 to perform desired functions.
For example, the memory 1220 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache. Non-volatile memory may include, for example, read-only memory (ROM), hard disks, erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, flash memory, and the like. One or more computer program modules may be stored on the computer-readable storage medium, and the processor 1210 may run the one or more computer program modules to realize the various functions of the electronic device 1200. Various applications and various data, as well as various data used and/or produced by the applications, may also be stored on the computer-readable storage medium.
It should be noted that, in embodiments of the present disclosure, for the specific functions and technical effects of the electronic device 1200, refer to the description of the information processing method above, which is not repeated here.
FIG. 13 is a schematic block diagram of another electronic device provided by some embodiments of the present disclosure. The electronic device 1300 is suitable, for example, for implementing the information processing method provided by embodiments of the present disclosure. The electronic device 1300 may be a terminal device or the like. Note that the electronic device 1300 shown in FIG. 13 is merely an example and places no limitation on the functions and scope of use of embodiments of the present disclosure.
As shown in FIG. 13, the electronic device 1300 may include a processing apparatus (for example, a central processor or a graphics processor) 1310 that can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 1320 or a program loaded from a storage apparatus 1380 into random access memory (RAM) 1330. The RAM 1330 also stores various programs and data needed for the operation of the electronic device 1300. The processing apparatus 1310, the ROM 1320, and the RAM 1330 are connected to one another by a bus 1340, to which an input/output (I/O) interface 1350 is also connected.
Generally, the following apparatuses may be connected to the I/O interface 1350: input apparatuses 1360 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output apparatuses 1370 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage apparatuses 1380 including, for example, magnetic tape and hard disks; and a communication apparatus 1390. The communication apparatus 1390 may allow the electronic device 1300 to communicate wirelessly or by wire with other electronic devices to exchange data. Although FIG. 13 shows an electronic device 1300 with various apparatuses, it should be understood that not all of the illustrated apparatuses are required to be implemented or present; the electronic device 1300 may alternatively implement or be provided with more or fewer apparatuses.
For example, according to embodiments of the present disclosure, the information processing method described above may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the information processing method described above. In such an embodiment, the computer program may be downloaded and installed from a network through the communication apparatus 1390, installed from the storage apparatus 1380, or installed from the ROM 1320. When the computer program is executed by the processing apparatus 1310, the functions defined in the information processing method provided by embodiments of the present disclosure can be realized.
At least one embodiment of the present disclosure further provides a computer-readable storage medium storing non-transitory computer-readable instructions that, when executed by a computer, can implement the information processing method described above. With this computer-readable storage medium, the server encodes and decodes the audio/video to be played by the first client, using the powerful software/hardware computing power of the cloud (i.e., the server) to meet the audio/video playback needs of most terminals, realizing a cloud player and solving the difficulty of audio/video playback in complex codec environments.
FIG. 14 is a schematic diagram of a storage medium provided by some embodiments of the present disclosure. As shown in FIG. 14, the storage medium 1400 stores non-transitory computer-readable instructions 1410. For example, when executed by a computer, the non-transitory computer-readable instructions 1410 can perform one or more steps of the information processing method described above.
For example, the storage medium 1400 can be applied in the electronic device 1200 described above; for example, the storage medium 1400 may be the memory 1220 in the electronic device 1200 shown in FIG. 12A. For example, for a description of the storage medium 1400, refer to the corresponding description of the memory 1220 in the electronic device 1200 shown in FIG. 12A, which is not repeated here.
The following points also need to be noted with respect to the present disclosure:
(1) The drawings of the embodiments of the present disclosure involve only the structures involved in the embodiments of the present disclosure; other structures may follow common designs.
(2) Without conflict, the embodiments of the present disclosure and the features therein may be combined with one another to obtain new embodiments.
The above are merely specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited thereto; the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (29)

  1. An information processing method, comprising:
    receiving a first initialization application of a first client and responding to the first initialization application, wherein the first initialization application comprises playback parameters of video stream data whose playback the first client supports;
    after responding to the first initialization application, receiving a task instruction of the first client;
    acquiring first video stream data according to the task instruction;
    converting the first video stream data into second video stream data so that the second video stream data has the playback parameters; and
    providing the second video stream data to the first client so that the second video stream data is played at the first client.
  2. The method according to claim 1, wherein receiving the first initialization application of the first client and responding to the first initialization application comprises:
    receiving the first initialization application of the first client; and
    responding to the first initialization application, creating a worker thread, and opening an input buffer and an output buffer for the worker thread,
    wherein the input buffer is configured to store the first video stream data, the worker thread is configured to obtain the first video stream data from the input buffer and process the first video stream data to obtain the second video stream data, and the output buffer is configured to receive the second video stream data provided by the worker thread and provide the second video stream data to the first client.
  3. The method according to claim 2, wherein acquiring the first video stream data according to the task instruction comprises:
    determining, according to the task instruction, a mode of acquiring the first video stream data, and acquiring the first video stream data in the mode.
  4. The method according to claim 3, wherein determining, according to the task instruction, the acquisition mode of the first video stream data and acquiring the first video stream data in the mode comprises:
    in response to the task instruction indicating that the acquisition mode is a uniform-resource-locator mode, obtaining a uniform resource locator from the task instruction;
    acquiring the first video stream data according to the uniform resource locator; and
    storing the first video stream data into the input buffer.
  5. The method according to claim 3, wherein determining, according to the task instruction, the acquisition mode of the first video stream data and acquiring the first video stream data in the mode comprises:
    in response to the task instruction indicating that the acquisition mode is a buffer mode, extracting the first video stream data from the task instruction; and
    storing the first video stream data into the input buffer.
  6. The method according to claim 3, wherein determining, according to the task instruction, the acquisition mode of the first video stream data and acquiring the first video stream data in the mode comprises:
    in response to the task instruction indicating that the acquisition mode is a fragmented byte-stream mode, sequentially receiving a plurality of task sub-instructions provided by the first client, wherein the plurality of task sub-instructions respectively comprise different parts of the first video stream data; and
    sequentially extracting partial video stream data of the first video stream data from the plurality of task sub-instructions, and sequentially storing the partial video stream data into the input buffer.
  7. The method according to any one of claims 2-6, wherein converting the first video stream data into the second video stream data so that the second video stream data has the playback parameters comprises:
    reading, by the worker thread, the first video stream data from the input buffer, and storing the first video stream data into a processing queue of the worker thread; and
    decoding and encoding, by the worker thread according to the playback parameters, the first video stream data in the processing queue to obtain the second video stream data.
  8. The method according to any one of claims 2-7, wherein providing the second video stream data to the first client so that the second video stream data is played at the first client comprises:
    generating encoding information according to the encoding operation performed by the worker thread on the first video stream data;
    storing the encoding information at the head of the queue of the output buffer, so that the first data packet output by the output buffer contains the encoding information;
    writing the second video stream data into the output buffer; and
    sequentially providing a plurality of data packets from the output buffer to the first client, wherein the plurality of data packets comprise the encoding information and the second video stream data.
  9. The method according to claim 8, wherein writing the second video stream data into the output buffer comprises:
    encapsulating playback basis information of key frames in the second video stream data as obfuscated packets, so that the first client parses the obfuscated packets to obtain the playback basis information,
    wherein the obfuscated packets comprise a plurality of randomly generated bytes.
  10. The method according to claim 7, wherein the worker thread comprises a read lock, the read lock being triggered when the input buffer is full,
    wherein, in response to the read lock being triggered, the worker thread no longer reads the first video stream data from the input buffer.
  11. The method according to claim 10, wherein the worker thread further comprises a write lock, the write lock being triggered when the output buffer is full,
    wherein, in response to the write lock being triggered, the worker thread no longer writes the second video stream data into the output buffer, and the worker thread no longer reads the first video stream data from the input buffer.
  12. The method according to any one of claims 2-11, further comprising:
    obtaining a read speed at which the input buffer reads in the first video stream data;
    obtaining a transcoding speed at which the worker thread converts the first video stream data into the second video stream data;
    obtaining a write speed at which the output memory outputs the second video stream data;
    determining whether the transcoding speed is greater than the write speed and whether the write speed is greater than the read speed; and
    in response to the read speed, the transcoding speed, and the write speed not satisfying that the transcoding speed is greater than the write speed and the write speed is greater than the read speed, adjusting the read speed, the transcoding speed, and the write speed.
  13. The method according to claim 12, wherein a transcoding bitrate is divided into a plurality of encoding layers with successively decreasing transcoding bitrates,
    wherein, in response to the read speed, the transcoding speed, and the write speed not satisfying that the transcoding speed is greater than the write speed and the write speed is greater than the read speed, adjusting the read speed, the transcoding speed, and the write speed comprises:
    in response to the transcoding speed being less than the write speed, adjusting, starting from a next key frame of the first video stream data, the transcoding bitrate to a transcoding bitrate corresponding to an encoding layer next below the encoding layer in which the transcoding bitrate is located.
  14. The method according to any one of claims 1-13, further comprising:
    obtaining a player instruction of the first client; and
    controlling, according to the player instruction, a playback state of the second video stream data at the first client.
  15. The method according to claim 14, wherein the player instruction comprises at least one of the following:
    a pause instruction, a start-playback instruction, a speed-change instruction, a reset-to-initial-state instruction, and a seek instruction.
  16. The method according to any one of claims 1-15, further comprising:
    before receiving the first initialization application of the first client, establishing a bidirectional data channel with the first client; and
    after responding to the first initialization application, establishing a correspondence among the bidirectional data channel, the output buffer, the input buffer, the verification information, and the worker thread, so as to interact with the first client according to the correspondence.
  17. The method according to claim 16, further comprising:
    in response to receiving a second initialization application provided by a second client, obtaining information to be verified from the second initialization application; and
    in response to the information to be verified being consistent with the verification information, obtaining, according to the correspondence, the output buffer corresponding to the verification information, and providing the second video stream data from the output buffer to the second client.
  18. The method according to claim 11, further comprising:
    monitoring the read lock and the write lock of the worker thread through a controller to obtain a read-lock event and a write-lock event; and
    providing the read-lock event and the write-lock event to the first client through the controller.
  19. The method according to claim 18, wherein the controller periodically provides a message to the worker thread,
    the method further comprising:
    in response to the controller receiving no response to the message from the worker thread within a preset period, clearing the worker thread.
  20. The method according to claim 7, wherein decoding and encoding, by the worker thread according to the playback parameters, the first video stream data in the processing queue to obtain the second video stream data comprises:
    loading, by the worker thread, the first video stream data in the processing queue into a codec; and
    decoding and encoding, by the codec, the first video stream data to obtain the second video stream data.
  21. The method according to claim 20, further comprising:
    in response to the first video stream data encountering an exception before entering the codec, sending, by the worker thread, a rollback event to the controller; and
    in response to the rollback event, returning, by the controller, the first video stream data to the input buffer.
  22. The method according to claim 20 or 21, further comprising:
    in response to an internal processing exception of the codec, requesting, by the worker thread, the controller to mark the worker thread as a zombie thread.
  23. The method according to any one of claims 20-22, further comprising:
    in response to an exception occurring after the codec processes the first video stream data, sending, by the worker thread, a packet-loss event to the controller, and sending, by the controller, a packet-loss reminder to the first client.
  24. An information processing method, comprising:
    sending an initialization application to a server, wherein the initialization application comprises playback parameters of video stream data whose playback is supported;
    after the server responds to the initialization application, sending a task instruction to the server; and
    receiving second video stream data provided by the server and playing the second video stream data, wherein the second video stream data is obtained by the server by converting first video stream data acquired according to the task instruction.
  25. The method according to claim 24, wherein receiving the second video stream data provided by the server and playing the second video stream data comprises:
    receiving the second video stream data provided by the server;
    parsing the second video stream data to obtain playback basis information in the second video stream data; and
    playing the second video stream data according to the playback basis information.
  26. An information processing apparatus, comprising:
    a first receiving unit configured to receive a first initialization application of a first client and respond to the first initialization application, wherein the first initialization application comprises playback parameters of video stream data whose playback the first client supports;
    a second receiving unit configured to receive a task instruction of the first client after responding to the first initialization application;
    an instruction obtaining unit configured to acquire first video stream data according to the task instruction;
    a conversion unit configured to convert the first video stream data into second video stream data so that the second video stream data has the playback parameters; and
    a providing unit configured to provide the second video stream data to the first client so that the second video stream data is played at the first client.
  27. An information processing apparatus, comprising:
    an application sending unit configured to send an initialization application to a server, wherein the initialization application comprises playback parameters of video stream data whose playback is supported;
    an instruction sending unit configured to send a task instruction to the server after the server responds to the initialization application; and
    a playback unit configured to receive second video stream data provided by the server and play the second video stream data, wherein the second video stream data is obtained by the server by converting first video stream data acquired according to the task instruction.
  28. An electronic device, comprising:
    a processor; and
    a memory including one or more computer program instructions,
    wherein the one or more computer program instructions are stored in the memory and, when executed by the processor, implement the information processing method according to any one of claims 1-25.
  29. A computer-readable storage medium non-transitorily storing computer-readable instructions, wherein, when the computer-readable instructions are executed by a processor, the information processing method according to any one of claims 1-25 is implemented.
PCT/CN2022/120543 2022-09-22 2022-09-22 Information processing method and apparatus, electronic device, and computer-readable storage medium WO2024060134A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/120543 WO2024060134A1 (zh) Information processing method and apparatus, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/120543 WO2024060134A1 (zh) Information processing method and apparatus, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2024060134A1 true WO2024060134A1 (zh) 2024-03-28

Family

ID=90453619

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/120543 WO2024060134A1 (zh) Information processing method and apparatus, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
WO (1) WO2024060134A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100097464A1 (en) * 2008-10-17 2010-04-22 Volpe Industries Network video surveillance system and recorder
CN110662100A (zh) * 2018-06-28 2020-01-07 ZTE Corporation Information processing method, apparatus, and system, and computer-readable storage medium
CN110740296A (zh) * 2019-09-30 2020-01-31 VisionVera Information Technology Co., Ltd. Method and apparatus for processing surveillance video streams of the video networking system
CN114520925A (zh) * 2020-11-19 2022-05-20 Xi'an NovaStar Tech Co., Ltd. Video stream processing method, device, and system, and cloud server system

Similar Documents

Publication Publication Date Title
US11653036B2 (en) Live streaming method and system, server, and storage medium
CN113423018B (zh) 一种游戏数据处理方法、装置及存储介质
WO2020103326A1 (zh) 一种一对多同屏方法、装置和系统、同屏设备及存储介质
CN108848060B (zh) 一种多媒体文件处理方法、处理系统及计算机可读存储介质
US10177958B2 (en) Method for synchronously taking audio and video in order to proceed one-to-multi multimedia stream
WO2020119823A1 (zh) 投屏方法、投屏装置、投屏设备
CN110430441B (zh) 一种云手机视频采集方法、系统、装置及存储介质
EP3267331A1 (en) Method and apparatus for cloud streaming service
Lei et al. Design and implementation of streaming media processing software based on RTMP
CN112261377B (zh) web版监控视频播放方法、电子设备及存储介质
WO2021042936A1 (zh) 视频数据的处理方法、装置、电子设备及计算机可读介质
US20200329388A1 (en) Miracast Framework Enhancements For Direct Streaming Mode
WO2024022317A1 (zh) 视频流的处理方法及装置、存储介质、电子设备
WO2024060134A1 (zh) Information processing method and apparatus, electronic device, and computer-readable storage medium
CN112543374A (zh) 一种转码控制方法、装置及电子设备
CN104639979A (zh) 视频分享方法及系统
CN115865884A (zh) 一种网络摄像头数据访问装置、方法、网络摄像头和介质
CN118077189A (zh) Information processing method and apparatus, electronic device, and computer-readable storage medium
CN111090818A (zh) 资源管理方法、资源管理系统、服务器及计算机存储介质
WO2018054349A1 (zh) 数据发送方法、数据接收方法及其装置和系统
CN112532719B (zh) 信息流的推送方法、装置、设备及计算机可读存储介质
CN114554277B (zh) 多媒体的处理方法、装置、服务器及计算机可读存储介质
CN108989767B (zh) 一种网络自适应的多路h264视频流存储转播方法及系统
WO2021217467A1 (zh) 一种智能摄像头的测试方法及装置
US11265357B2 (en) AV1 codec for real-time video communication

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22959143

Country of ref document: EP

Kind code of ref document: A1