WO2021051912A1 - Media data transmission method and related device - Google Patents

Media data transmission method and related device

Info

Publication number
WO2021051912A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
target image
frame
video frame
image
Prior art date
Application number
PCT/CN2020/097302
Other languages
French (fr)
Chinese (zh)
Inventor
刘俊
杨胜凯
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021051912A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/08 Systems for the simultaneous or sequential transmission of more than one television signal, e.g. additional information signals, the signals occupying wholly or partially the same frequency band, e.g. by time division
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N5/765 Interface circuits between an apparatus for recording and another apparatus
    • H04N5/77 Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera

Definitions

  • This application relates to the field of data processing, and in particular to a media data transmission method and related equipment.
  • the camera shoots a large amount of video data, and then transmits the video to the storage device for storage.
  • the camera also needs to transmit pictures to the storage device so that the storage device can perform operations such as image recognition.
  • the additional image transmission increases the network bandwidth and storage device space occupation.
  • the embodiments of the present application provide a media data transmission method and related equipment, which can transmit video and image location information, which effectively reduces the consumption of transmission bandwidth.
  • an embodiment of the present application provides a media data transmission method, the method including:
  • the camera generates multiple original video frames
  • the camera uses multiple original video frames to generate video
  • the camera obtains the position information of the target image in the video, where the target image is a video frame in the video or a part of the video frame in the video;
  • the camera sends the video and location information to the storage device.
  • the transmission bandwidth occupied by the location information is also much smaller than the transmission bandwidth occupied by the image, which reduces the consumption of bandwidth resources.
  • the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
  • the position information includes a second absolute position and/or a second relative position, where the second absolute position includes the frame number and timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
  • the position information includes the absolute position and the relative position, which can be transmitted at the same time so that each verifies the other, avoiding errors caused by the loss of video frames during transmission.
  • the camera does not generate the target image, and does not send the target image to the storage device.
  • the camera does not generate the target image and does not send the target image to the storage device, so there is no need to transmit the target image separately, which reduces the consumption of bandwidth resources.
  • the method further includes:
  • the camera selects the target frame where the target image is located from the video
  • the camera obtains the position information of the target image in the video according to the target frame and the video.
  • the target image is located in the I frame in the GOP of the video.
  • the method further includes:
  • the camera obtains a target image from multiple original video frames, and the image quality of the target image is an image including the target feature in the multiple original video frames.
  • the method further includes: the storage device receives the video sent by the camera and the location information of the target image;
  • the storage device obtains the target image from the corresponding video frame of the video according to the location information.
  • the method further includes:
  • the storage device stores the video sent by the camera and the location information of the target image
  • the storage device obtains the target image from the corresponding video frame of the video according to the location information;
  • the storage device saves the target image
  • the storage device deletes the video.
  • the target image is obtained from the corresponding video frame of the video according to the location information and saved, so that the target image remains available after the video is deleted and can be provided for subsequent image searches.
  • the method further includes:
  • the storage device receives the video stream sent by the camera and the location information of the target image, where the target image is a video frame in the video stream or a part of the video frame in the video stream;
  • the storage device obtains the target image from the corresponding video frame of the video stream according to the location information.
  • an embodiment of the present application provides a media data transmission method, the method including:
  • the storage device receives the video sent by the camera and the location information of the target image, where the target image is a video frame in the video or a part of the video frame in the video;
  • the storage device obtains the target image from the corresponding video frame of the video according to the location information.
  • the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
  • the position information includes a second absolute position and/or a second relative position, where the second absolute position includes the frame number and timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
  • when the storage device receives a request to read the target image, the storage device obtains the corresponding video frame from the video according to the location information, so as to obtain the target image.
  • the storage device obtains the target image from the corresponding video frame of the video according to the location information; the storage device saves the target image and deletes the video.
  • the target image is obtained from the corresponding video frame in the video. Therefore, during the storage life cycle of the video, there is no need to store the target image.
  • the storage device only needs to store the video and location information.
  • the memory space occupied by the location information is much smaller than the memory space occupied by the target image, so storing the location information can reduce the consumption of storage resources relative to storing the target image.
  • the storage device does not store the target image during the storage life cycle of the video; after the storage life cycle of the video ends, the storage device stores the target image.
  • an embodiment of the present application provides a media data transmission device, and the device includes:
  • the first generating unit is used to generate multiple original video frames
  • the second generating unit is used to generate a video using multiple original video frames
  • An acquiring unit for acquiring position information of the target image in the video, where the target image is a video frame in the video or a part of the video frame in the video;
  • the sending unit is used to send the video and location information to the storage device.
  • the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
  • the position information includes a second absolute position and/or a second relative position, where the second absolute position includes the frame number and timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
  • the media data transmission apparatus does not generate the target image, and does not send the target image to the storage device.
  • the camera selects the target frame where the target image is located from the video
  • the camera obtains the position information of the target image in the video according to the target frame and the video.
  • the target image is located in the I frame in the GOP of the video.
  • the target image is obtained from multiple original video frames, and the image quality of the target image is an image including the target feature in the multiple original video frames.
  • an embodiment of the present application provides a media data transmission device, and the device includes:
  • the receiving unit is used to receive the video sent by the camera and the location information of the target image, where the target image is a video frame in the video or a part of the video frame in the video;
  • the acquiring unit is used to acquire the target image from the corresponding video frame of the video according to the location information.
  • the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
  • the position information includes a second absolute position and/or a second relative position, where the second absolute position includes the frame number and timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
  • when the storage device receives a request to read the target image, the storage device obtains the corresponding video frame from the video according to the location information, so as to obtain the target image.
  • the storage device obtains the target image from the corresponding video frame of the video according to the location information; the storage device saves the target image and deletes the video.
  • an embodiment of the present application provides a camera, which includes:
  • the transceiver module is used to send the video and location information to the storage device.
  • the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
  • the position information includes a second absolute position and/or a second relative position, where the second absolute position includes the frame number and timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
  • the camera does not generate the target image
  • the transceiver module does not send the target image to the storage device.
  • the position information of the target image in the video is obtained.
  • the target image is located in the I frame in the GOP of the video.
  • the target image is obtained from multiple original video frames, and the image quality of the target image is the image including the target feature in the multiple original video frames.
  • an embodiment of the present application provides a storage device, which includes:
  • the transceiver module is used to receive the video sent by the camera and the position information of the target image, where the target image is a video frame in the video or a part of the video frame in the video;
  • the processor is used to obtain the target image from the corresponding video frame of the video according to the position information.
  • the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
  • the position information includes a second absolute position and/or a second relative position, where the second absolute position includes the frame number and timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
  • when the storage device receives a request to read the target image, the storage device obtains the corresponding video frame from the video according to the location information, so as to obtain the target image.
  • in a possible embodiment of the sixth aspect, the storage device is also used to: at the end of the storage life cycle of the video, obtain the target image from the corresponding video frame of the video according to the location information; save the target image; and delete the video.
  • an embodiment of the present application provides a camera, which includes a processor, a transceiver, and a memory, and the processor executes the code in the memory to execute the method as in the first aspect.
  • an embodiment of the present application provides a storage device.
  • the storage device includes a processor, a transceiver, and a memory, and the processor executes code in the memory to execute the method in the second aspect.
  • an embodiment of the present application provides a computer-readable storage medium, and the computer-readable storage medium stores a computer program.
  • the computer program includes program instructions which, when executed, carry out any of the methods of the second aspect.
  • a tenth aspect provides a computer program product.
  • when the computer program product is read and executed by a computer, the method of any one of the first aspect and the second aspect is executed.
  • FIG. 1 is a schematic diagram of a camera collecting images according to an embodiment of the application
  • FIG. 2 provides a schematic diagram of a large image and a small image according to an embodiment of the application
  • FIG. 3A is a schematic diagram of the transmission of a video frame according to an embodiment of this application
  • FIG. 3B provides a schematic diagram of video and location information transmitted by a camera according to an embodiment of this application
  • FIG. 4 is a schematic diagram of a storage device storing video and location information according to an embodiment of this application.
  • FIG. 5A provides a schematic diagram of an index table of a large image according to an embodiment of this application.
  • FIG. 5B provides a schematic diagram of an index table of thumbnails according to an embodiment of this application.
  • FIG. 6A provides a schematic diagram of a camera extracting a target image according to an embodiment of this application
  • FIG. 6B provides a schematic diagram of an index table for multiplex storage of thumbnails according to an embodiment of this application
  • FIG. 6C provides a schematic diagram of another index table for multiplex storage of small pictures according to an embodiment of this application.
  • FIG. 6D provides a schematic diagram of an index table of another large image according to an embodiment of the present application.
  • FIG. 7 is an interactive schematic diagram of a media data transmission method according to an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a media data transmission device according to an embodiment of the application.
  • FIG. 9 is a schematic structural diagram of a camera provided in an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of another camera provided in an embodiment of this application.
  • FIG. 11 is a schematic structural diagram of a media data transmission device according to an embodiment of the application.
  • FIG. 12 is a schematic structural diagram of a storage device provided in an embodiment of this application.
  • FIG. 13 is a schematic structural diagram of a server provided in an embodiment of this application.
  • the camera collects video along the time axis t, where the video includes n frames of images I_1, I_2, ..., I_n, and the image I_1 is the image collected by the camera at time t_1.
  • the image I_2 is the image collected by the camera at time t_2
  • the image I_n is the image collected by the camera at time t_n.
  • the time intervals between t_1, t_2, ..., t_n may be equal or unequal, that is, t_n - t_{n-1}, t_{n-1} - t_{n-2}, ..., t_2 - t_1 may be equal or unequal, and there is no specific limitation here.
  • in addition to sending the video to the storage device, the camera also needs to select a target image from the video and send it to the storage device.
  • the target image can be a large image or a small image
  • the storage device can be a storage server, a video stream management platform, and so on.
  • the large image can be a complete image of a certain video frame, or an image that occupies an area of a certain video frame exceeding a preset threshold, and so on.
  • the big picture can contain multiple target subjects (the subject can be understood as target features).
  • the big picture can include a scene where a vehicle hits a pedestrian. Therefore, the big picture can be used to analyze the relationships between different target subjects and their behavior.
  • the thumbnail can be a partial area of a certain video frame.
  • the thumbnail may include only a single target subject, or only a partial area of a single target subject.
  • the thumbnail may include the face of a pedestrian. Therefore, the thumbnail can be used to analyze a single target.
  • the target subject can be pedestrians, animals, vehicles, license plates, road signs, traffic lights, etc., which are not specifically limited here.
  • the small image can be obtained by extracting the regional image block with the target subject from the large image.
  • the extraction method can be an image feature extraction algorithm, specifically HOG (Histogram of Oriented Gradients), SIFT (Scale-Invariant Feature Transform), etc., which are not specifically limited here.
  • the camera can also use video compression algorithms to compress the video and image compression algorithms to compress the target image.
  • the video compression algorithm can be H.264, H.265, H.266, etc., which is not specifically limited here.
  • the image compression algorithm can be JPEG, HEIF, etc., which is not specifically limited here.
  • the camera uses a video compression algorithm to compress the video and an image compression algorithm to compress the target image. Compression reduces the bandwidth occupied, but bandwidth is still consumed during transmission; when a large number of target images are transmitted, the consumption of bandwidth resources remains substantial. Moreover, after the target images are transmitted to the storage device, the storage device must consume memory resources to store them, which leads to a large consumption of storage resources. Therefore, how to reduce the transmission of target images is a problem that needs to be solved.
  • the embodiments of this application aim to solve the above problems: the large bandwidth consumption when transmitting the target image, and the large consumption of memory and hard disk resources when the storage device stores it.
  • the video (which can be transmitted in a streaming mode, in which case it is called a video stream) and the location information of the target image are transmitted. Since the size of the location information is much smaller than the size of the image, the bandwidth occupied by the location information is also much smaller than the bandwidth occupied by the image, which reduces the consumption of bandwidth resources.
  • after the storage device receives the video and the location information of the target image, the storage device stores the video and the location information. Storing the location information also consumes fewer storage resources than storing the target image.
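To make the size argument concrete, the following minimal sketch packs the location of one target image into a fixed-size record and compares it with the size of the raw pixel block it refers to. The field layout (frame number, x, y, width, height as 32-bit unsigned integers) and the function names are assumptions for illustration only; the application does not specify a wire format.

```python
import struct

# Hypothetical record layout: frame number, x, y, width, height,
# each stored as a little-endian unsigned 32-bit integer.
LOCATION_FORMAT = "<5I"


def pack_location(frame_number, x, y, width, height):
    """Serialize the location of a target image into a fixed-size record."""
    return struct.pack(LOCATION_FORMAT, frame_number, x, y, width, height)


def raw_image_size(width, height, bytes_per_pixel=3):
    """Size in bytes of the uncompressed pixel block, for comparison only."""
    return width * height * bytes_per_pixel


record = pack_location(frame_number=5, x=100, y=200, width=80, height=80)
print("location record bytes:", len(record))
print("raw 80x80 image bytes:", raw_image_size(80, 80))
```

A compressed image would be smaller than the raw block, but under these assumptions the location record stays at a constant 20 bytes regardless of the image size.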
  • This application provides a media data transmission method and related equipment, which can effectively reduce the consumption of bandwidth resources and storage resources.
  • the video and location information can be transmitted at the same time or separately; they can be transmitted on the same channel or on different channels (for example, both use data channels, or the video stream uses a data channel and the location information uses a management channel); there is no specific limitation here.
  • the following is a detailed introduction to the position information when the target image is a large image and a small image.
  • the position information of the target image may be the first absolute position or the first relative position, and may also include the first absolute position and the first relative position at the same time.
  • the first absolute position may include one or more of the frame number and time stamp of the target image in the video.
  • for example, for a video comprising n frames of images I_1, I_2, ..., I_n, the target image may be the image of the fifth frame
  • the first absolute position of the target image may then be frame number 5 in the video.
  • the first relative position may be an offset relative to a certain video frame, etc.
  • for example, for a video comprising n frames of images I_1, I_2, ..., I_n, the target image may be the image of the fifth frame.
  • the first relative position can be the offset 4 of the target image relative to the first frame; or, when the specific video frame is, for example, an I frame of the compressed video, the offset can be the offset of the target image relative to that I frame, and so on.
  • the position information can be described using the first absolute position; specifically, the position information can be the frame number of the I frame.
  • the position information can also be described using the first relative position, which may specifically be the offset of the target image relative to the I frame.
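The first absolute and first relative positions described above can be sketched as a small record. The class name `FirstPosition`, its field names, and the helper `relative_to` are hypothetical; only the frame number, timestamp, and offset concepts come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirstPosition:
    # First absolute position: frame number and/or timestamp in the video.
    frame_number: Optional[int] = None
    timestamp_ms: Optional[int] = None
    # First relative position: offset measured from a specific video frame.
    reference_frame: Optional[int] = None
    offset: Optional[int] = None


def relative_to(frame_number, reference_frame):
    """Describe a frame both absolutely and relative to a reference frame."""
    return FirstPosition(
        frame_number=frame_number,
        reference_frame=reference_frame,
        offset=frame_number - reference_frame,
    )


# The fifth frame described relative to the first frame has offset 4.
fifth = relative_to(frame_number=5, reference_frame=1)
print(fifth)
```

The reference frame could equally be the nearest I frame, in which case the offset counts frames from that I frame.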
  • the position information of the target image may also include the position of the target image in the video frame, and the position includes coordinates and size (see the following description for details).
  • the position information of the target image may be the second absolute position or the second relative position, or include the second absolute position and the second relative position at the same time.
  • the second absolute position may include one or more of the frame number and time stamp of the video frame corresponding to the target image in the video, and the position of the target image in the video frame.
  • the position includes coordinates and size.
  • the coordinates of the target image in the video frame can be expressed as (x, y), where x is the horizontal coordinate and y is the vertical coordinate; the size of the target image in the video frame can be expressed as m×n, where m is the horizontal size and n is the vertical size.
  • the second relative position includes any of the following combinations: the absolute position of the video frame corresponding to the target image and the relative position of the target image in the corresponding video frame; the relative position of the video frame corresponding to the target image and the absolute position of the target image in the corresponding video frame; or the relative position of the video frame corresponding to the target image and the relative position of the target image in the corresponding video frame.
  • the relative position of the target image in the corresponding video frame can be an offset relative to the position of a specific mark.
  • for example, if the video frame is an image of the target pedestrian visiting Tiananmen Square and the target image is the target pedestrian, the relative position of the target image in the video frame may be the direction and distance of the target pedestrian relative to Tiananmen Square. For instance, the position of a target pedestrian 100 meters to the east of Tiananmen Square is expressed as (East, 100).
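As an illustration of how coordinates (x, y) and a size of m by n identify a small image inside a video frame, here is a minimal cropping sketch. It assumes that (x, y) is the top-left corner of the region and that the frame is a list of pixel rows with the origin at the top left; the text does not fix either convention, and the function name `crop` is hypothetical.

```python
def crop(frame, x, y, m, n):
    """Return the block that is m pixels wide and n pixels high at (x, y).

    Assumes `frame` is a list of rows and (x, y) is the top-left corner.
    """
    if x < 0 or y < 0 or m <= 0 or n <= 0:
        raise ValueError("coordinates must be non-negative and size positive")
    if y + n > len(frame) or x + m > len(frame[0]):
        raise ValueError("region extends beyond the video frame")
    return [row[x:x + m] for row in frame[y:y + n]]


# A 4x4 test frame whose pixel value encodes its row and column.
frame = [[r * 10 + c for c in range(4)] for r in range(4)]
thumbnail = crop(frame, x=1, y=2, m=2, n=1)
print(thumbnail)
```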
  • the location information of the target image may also include the frame category of the video frame in which the target image is located during transmission after compression.
  • the frame category includes I frame and P frame.
  • compression yields I frames and P frames.
  • an I frame can be a complete video frame
  • an I frame can be understood as a key frame
  • a P frame is the difference between the current frame and the previous key frame.
  • for example, if the I frame is an image of a car at a certain moment while it is driving, then the P frame can be the position offset of the car at the next moment relative to the previous moment, etc.
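The relationship between an I frame and a P frame, as described above, can be illustrated with a simplified per-pixel difference. This is only a sketch of the idea: real codecs such as H.264 use motion-compensated prediction and entropy coding rather than a plain subtraction, and the function names here are hypothetical.

```python
def encode_p_frame(key_frame, frame):
    """Store a frame as its per-pixel difference from the key frame."""
    return [cur - ref for cur, ref in zip(frame, key_frame)]


def decode_p_frame(key_frame, delta):
    """Rebuild the full frame from the key frame plus the stored difference."""
    return [ref + d for ref, d in zip(key_frame, delta)]


i_frame = [10, 10, 200, 10]        # complete picture (key frame)
next_frame = [10, 10, 10, 200]     # the bright pixel moved one step right
p_frame = encode_p_frame(i_frame, next_frame)
restored = decode_p_frame(i_frame, p_frame)
print(p_frame, restored)
```

One consequence worth noting: if the target image lies in a P frame, the receiver needs the associated I frame as well before it can reconstruct the picture, which is one reason the frame category is useful as part of the location information.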
  • the target image can be an I frame or a P frame.
  • when the camera sends the video and location information to the storage device, it can also send the associated information of the target image.
  • the associated information includes the acquisition time of the video, the acquisition time of the video frame corresponding to the target image, the camera identifier, the category of the target image, the sequence number of the target image, the frame number of the video frame corresponding to the target image, and the offset of the video frame corresponding to the target image.
  • the target image categories include large images and small images, and the time when the video is collected can be the start time of the video, etc.
  • the absolute position and the relative position can be transmitted at the same time so that each verifies the other, avoiding errors caused by the loss of video frames during transmission.
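One way the mutual verification could work is sketched below: the receiver derives a frame number from the relative position and checks it against the absolute one. The function name `resolve_frame` and the choice to raise an error on mismatch are assumptions; the text only states that the two positions verify each other.

```python
def resolve_frame(absolute_frame, reference_frame, offset):
    """Return the frame number if both descriptions agree, else raise."""
    derived = reference_frame + offset
    if derived != absolute_frame:
        raise ValueError(
            f"position mismatch: absolute={absolute_frame}, derived={derived}"
        )
    return absolute_frame


# Frame 5 described both absolutely and as offset 4 from frame 1.
print(resolve_frame(absolute_frame=5, reference_frame=1, offset=4))
```

A mismatch suggests that frames were lost or renumbered in transit, so the receiver can flag the record instead of extracting the wrong picture.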
  • the position information transmitted between the camera and the storage device is compressed.
  • the camera may compress the original position information, and send the compressed position information to the storage device.
  • the original location information may also be part or all of the location information described above.
  • when the original position information is only part of the position information described above, it includes at least the position information of the target image.
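The text does not name a compression scheme for the position information. As a minimal sketch under that caveat, the example below serializes a list of position records as JSON and compresses it with zlib from the Python standard library; the record fields and function names are hypothetical.

```python
import json
import zlib


def compress_positions(positions):
    """Serialize position records to compact JSON and compress with zlib."""
    raw = json.dumps(positions, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)


def decompress_positions(blob):
    """Inverse of compress_positions."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical batch of thumbnail positions in consecutive frames.
positions = [
    {"frame": f, "x": 100, "y": 200, "w": 80, "h": 80} for f in range(1, 51)
]
blob = compress_positions(positions)
print(len(json.dumps(positions)), "->", len(blob), "bytes")
```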
  • Figure 4 shows a specific example when the storage device stores the received video and location information.
  • after receiving the video and location information, the storage device stores the video in the storage space corresponding to the video and stores the location information.
  • the position information can also be further processed to obtain an index table of the target image, and store the index table.
  • the position information (or index table) can represent the position information of the target picture in the video.
  • the index table includes the index table of the big picture and the index table of the small picture.
  • the index table of the big picture and the index table of the small picture can be stored separately; that is, all the large-picture index tables are stored in one storage space, and all the small-picture index tables are stored in another storage space.
  • obtaining the index table of the target image may specifically be: extracting the position information of the target image, and generating the index table of the target image according to a preset index table generation template.
  • the location information of the target image can be extracted from the cache or from the memory.
  • the preset index table generation template may be a preset template.
  • FIG. 5A shows a schematic diagram of an index table of a large image
  • FIG. 5B shows a schematic diagram of an index table of a small image.
  • the index table of the big picture includes the camera ID, the collection time of the video frame corresponding to the big picture, the frame number of the video frame, the picture type, the picture sequence number, the video frame type, the video frame offset, etc.
  • the content of the index table can be directly extracted from the received location information.
  • the index table of the thumbnail includes the camera ID, the collection time of the video frame in which the thumbnail is located, the frame number of the video frame, the picture type, the picture sequence number, the video frame type, the video frame offset, and the time when the thumbnail is located.
  • the offset of the thumbnail in the video frame is expressed in the form of coordinates.
  • the size of the thumbnail in the video frame is expressed by the horizontal size and the vertical size. For example, 80X80 means that the horizontal size is 80 and the vertical size is 80.
  • the content of the above index table can be directly extracted from the received location information.
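The index table fields listed above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `IndexEntry`, the field names, the field types, and the helper `split_tables` are hypothetical and do not come from the application, which defines the tables only through FIG. 5A and FIG. 5B.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class IndexEntry:
    """One row of a target-image index table (field names are illustrative)."""
    camera_id: str
    capture_time: float          # collection time of the video frame
    frame_number: int            # frame number of the video frame in the video
    picture_type: str            # "large" or "small"
    picture_seq: int             # picture sequence number
    frame_type: str              # for example "I" for a key frame
    frame_offset: int            # video frame offset
    # The two fields below are only filled in for small images.
    thumb_offset: Optional[Tuple[int, int]] = None   # (x, y) coordinates in the frame
    thumb_size: Optional[Tuple[int, int]] = None     # (horizontal, vertical), e.g. (80, 80)


def split_tables(entries):
    """Keep large-image and small-image entries in separate tables."""
    large = [e for e in entries if e.picture_type == "large"]
    small = [e for e in entries if e.picture_type == "small"]
    return large, small
```

Keeping the two optional fields on the same record type is a simplification; two separate record types would match the separately stored tables more literally.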
  • the storage device does not need to store the target image separately. After the storage device receives the request from the host to read the target image, it uses the index table to obtain the corresponding target image from the video and sends it to the host.
  • After the storage device stores the video, it sets a storage life cycle for the video, and deletes the video after the storage life cycle ends.
  • the storage life cycle can be specifically understood as a fixed duration.
  • the storage device can extract the target image from the video at the end of the video storage life cycle, store the target image in the corresponding storage space, and update the picture index table (the updated index table is used to describe the storage location of the target image in the storage device).
  • the end of the storage life cycle is a condition that triggers the step of extracting the target image from the video. After the extraction of the target image is completed, the storage device deletes the video.
  • the end of the life cycle includes: the life cycle is about to reach the end time point, or a short time after the end time point of the life cycle.
  • the storage device can also extract the target image from the video before the end of the storage life cycle of the video, for example, perform the extraction operation and store the result within the 10 minutes before the end of the storage life cycle. After the end of the storage life cycle, the storage device can then delete the video immediately.
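The life-cycle handling described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `expire_videos`, the dictionary layout, and the `lead_time` parameter (600 seconds, matching the 10-minute example) are assumptions, and whole frames stand in for target images.

```python
def expire_videos(videos, index_table, image_store, now, lead_time=600):
    """Extract indexed target images shortly before expiry, delete videos once expired.

    videos:      dict video_id -> {"expires_at": float, "frames": list}
    index_table: list of dicts with "video_id", "frame_number" and "location"
    image_store: dict (video_id, frame_number) -> saved image
    (All three layouts are hypothetical.)
    """
    for video_id in list(videos):
        video = videos[video_id]
        if now < video["expires_at"] - lead_time:
            continue  # the storage life cycle is not close to its end yet
        for entry in index_table:
            if entry["video_id"] != video_id or entry["location"] != "video":
                continue
            key = (video_id, entry["frame_number"])
            image_store[key] = video["frames"][entry["frame_number"]]
            entry["location"] = "image_store"  # updated index points at the saved image
        if now >= video["expires_at"]:
            del videos[video_id]  # extraction is done, so the video can be deleted
```

Extracting inside the lead-time window means that deletion at the end time point never has to wait for extraction.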
  • When extracting the target image from the video according to the index table of the target image, the procedure may specifically be: acquiring at least one video corresponding to the camera identifier according to the camera identifier, and then determining the target video from the at least one video according to the video acquisition time, where the target video includes the target image. According to the time of the video frame in the index table, the type of the video frame, the frame number of the video frame in the video, and the video frame offset, the video frame is extracted from the target video, and the target image is then obtained from the video frame according to the position information of the target image. If the target image is a large image, the video frame can be determined as the target image (here, the large image being a complete video frame is taken as an example).
  • FIG. 6A shows a schematic diagram of extracting a target image.
  • the storage device retrieves n videos (video 1, video 2, ..., video n-1, video n) from the storage space corresponding to the video, determines m videos (video k, ..., video j) from the n videos according to the camera ID, and determines the target video from the m videos according to the video acquisition time. It then determines the target image from the n video frames of the target video according to other information in the index table. The other information includes the time of the video frame, the video frame type, the frame number of the video frame, the video frame offset, the position information of the target image (first absolute position and/or first relative position, second absolute position and/or second relative position), etc.
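The narrowing steps above (camera ID, then acquisition time, then frame, then region) can be sketched as below. This is an illustration under assumptions: videos are plain dictionaries, frames are nested lists of pixel values, and the names `find_target_video` and `extract_target_image` are hypothetical. A real storage device would decode frames from an encoded stream instead.

```python
def find_target_video(videos, camera_id, capture_time):
    """Narrow by camera ID, then pick the video whose time span covers capture_time."""
    same_camera = [v for v in videos if v["camera_id"] == camera_id]
    for video in same_camera:
        if video["start_time"] <= capture_time <= video["end_time"]:
            return video
    return None  # no stored video contains the requested frame


def extract_target_image(video, frame_number, offset=None, size=None):
    """Return the whole frame (large image) or a cropped region (small image)."""
    frame = video["frames"][frame_number]
    if offset is None or size is None:
        return frame  # large image: the complete video frame
    x, y = offset
    width, height = size
    return [row[x:x + width] for row in frame[y:y + height]]
```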
  • When extracting a target image from a video, first determine whether there is a target image in the video. If there is a target image, extract the target image from the video according to the index table of the target image and store it in the image storage space. Whether there is a target image in the video can be determined according to the index table. Specifically, the frame number of the video frame and the acquisition time of the video frame in the index table are used to determine whether the video frame exists in the video. If the video frame exists, it is determined that there is a target image in the video; if it does not exist, it is determined that there is no target image in the video.
  • the large image and the small image can be stored separately or multiplexed. Multiplexed storage can be understood as follows: when the target image is a small image, the large image or video frame in which the target image is located is stored, together with the location information of the small image in that large image or video frame, thereby achieving the effect that both the small image and the large image/video frame are stored.
  • different encoding formats may be adopted to encode the target image and then stored, and the encoding format may be the HEIF format.
  • the updated index table is used to describe the storage location of the target image, the picture type of the target image, and the encoding format of the target image during storage.
  • FIG. 6B shows a schematic diagram of the index table of the thumbnail when the thumbnail is multiplexed and stored. At this time, the video frame corresponding to the thumbnail is multiplexed when the thumbnail is stored.
  • Figure 6C shows that the small image is multiplexed and stored, and the large image corresponding to the small image is multiplexed when the small image is stored.
  • FIG. 6D shows a schematic diagram of the index table of the updated large image.
  • the thumbnail storage types include 0 and 1, 1 means the thumbnail is multiplexed and stored, and 0 means the thumbnail is stored separately.
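Multiplexed storage can be sketched as storing each frame once and recording every small image as a region of that frame. This is a hedged illustration: the function names, the dictionary keys, and the use of nested lists as frames are assumptions. Only the storage type values (1 for multiplexed, 0 for stored separately) come from the text above.

```python
def store_small_image(image_store, index_table, frame_key, frame, offset, size):
    """Store a small image by reference: keep the frame once, index the region."""
    image_store.setdefault(frame_key, frame)   # shared by all small images in the frame
    index_table.append({
        "frame_key": frame_key,
        "offset": offset,        # (x, y) of the small image in the frame
        "size": size,            # (horizontal, vertical)
        "storage_type": 1,       # 1: multiplexed, 0: stored separately
    })


def load_small_image(image_store, entry):
    """Crop the indexed region out of the shared frame."""
    x, y = entry["offset"]
    width, height = entry["size"]
    frame = image_store[entry["frame_key"]]
    return [row[x:x + width] for row in frame[y:y + height]]
```

With this layout, several small images cut from the same frame cost one stored frame plus one small index entry each.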
  • the storage device After the storage device stores the video, if it receives a request to read the target image, it can extract the target image from the video or extract the target image from the storage space of the target image, and feed back the target image to the requesting party.
  • if the request to read the target image is received during the storage life cycle of the video, the target image is extracted from the video according to the index table; if the request is received after the storage life cycle of the video ends, the target image is extracted from the target image storage space according to the index table.
  • after the target image is extracted, the target image is fed back to the requesting party.
  • the picture format of the target image can be converted to the picture format corresponding to the requesting party. For example, if the requester requests the JPEG format, the format of the target image is converted to the JPEG format.
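The read path described above can be sketched as a single routing function. This is an assumption-laden illustration: `read_target_image`, the dictionary layouts, and the optional `convert` callable are hypothetical, and the actual format conversion (for example to JPEG) is left to whatever function the caller supplies.

```python
def read_target_image(entry, videos, image_store, convert=None):
    """Serve a read request from the video if it still exists, else from saved images.

    entry:       dict with "video_id" and "frame_number" (hypothetical index row)
    videos:      dict video_id -> {"frames": list}; a missing key means the
                 storage life cycle has ended and the video was deleted
    image_store: dict (video_id, frame_number) -> saved target image
    convert:     optional caller-supplied function that converts the image
                 to the format the requesting party asked for
    """
    video = videos.get(entry["video_id"])
    if video is not None:
        image = video["frames"][entry["frame_number"]]
    else:
        image = image_store[(entry["video_id"], entry["frame_number"])]
    return convert(image) if convert is not None else image
```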
  • the storage device can also receive the video stream and the location information of the target image sent by the camera, and the storage device obtains the target image from the corresponding video frame in the video stream according to the location information.
  • the location information of the video and the target image, as well as the implementation of obtaining the target image from the video, will not be repeated here.
  • FIG. 7 is an interactive schematic diagram of a media data transmission method provided in an embodiment of this application.
  • the data transmission method of this embodiment includes the following steps:
  • the camera acquires position information of a target image in the video, where the target image is a video frame in the video or a part of the video frame in the video.
  • the target image includes a large image and/or a small image.
  • the large image can be a complete image of a certain video frame, or an image that occupies an area of a certain video frame that exceeds a preset threshold, etc.; the small image can be a partial area of a certain video frame.
  • the thumbnail may only include a single target subject, or only a partial area of a single target subject.
  • the position information includes an absolute position and a relative position.
  • the absolute position may be, for example, the frame number of a video frame, a time stamp, etc.
  • the relative position may be, for example, an offset relative to a specific video frame.
  • Before acquiring the position information of the target image in the video, the camera generates multiple original video frames and uses the multiple original video frames to generate the video.
  • the camera sends the video and location information to the storage device.
  • When the camera sends the video and location information to the storage device, the two can be sent simultaneously or non-simultaneously.
  • the storage device receives the video sent by the camera and the location information of the target image, where the target image is a video frame in the video or a part of the video frame in the video.
  • the storage device obtains a target image from a corresponding video frame of the video according to the location information.
  • When the storage device obtains the target image according to the location information, it can obtain the target image according to the index table that carries the location information. Specifically, it can obtain the target image from the corresponding video frame in the video according to the index table, or it can obtain the target image from the target image storage space according to the index table.
  • when the target image is a large image, the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to the specific video frame.
  • when the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame, and the position includes coordinates and size; the second relative position includes the offset of the target image relative to the specific video frame, and the coordinates and size of the target image in the corresponding video frame.
  • the camera does not generate the target image, and does not send the target image to the storage device.
  • it further includes:
  • the camera selects the target frame where the target image is located from the video
  • the camera obtains the position information of the target image in the video according to the target frame and the video.
  • the target frame can be understood as the video frame where the target image is located in the foregoing embodiment.
  • the target image is located in the I frame in the GOP of the video.
  • the I frame in the GOP of the video is a key frame.
  • the camera obtains a target image from a plurality of original video frames, and the image quality of the target image is an image including the target feature in the plurality of original video frames.
  • the target feature can be understood as a specific feature, for example, the behavior among multiple subjects.
  • the storage device receives the video sent by the camera and the location information of the target image
  • the storage device obtains the target image from the corresponding video frame of the video according to the location information.
  • it further includes:
  • the storage device stores the video sent by the camera and the location information of the target image
  • the storage device obtains the target image from the corresponding video frame of the video according to the location information;
  • the storage device saves the target image
  • the storage device deletes the video.
  • it further includes:
  • the storage device receives the video stream sent by the camera and the location information of the target image, where the target image is a video frame in the video stream or a part of the video frame in the video stream;
  • the storage device obtains the target image from the corresponding video frame of the video stream according to the location information.
  • This embodiment does not expand on the definitions of large images, small images, location information, index tables, and so on.
  • For those definitions, refer to Figure 2, Figure 3A, Figure 3B, Figure 5A, and Figure 5B and the related descriptions of large images, small images, location information, index tables, and specific video frames.
  • This embodiment also does not introduce the collection of video by the camera, the transmission of the video, etc.
  • For other terms and definitions please refer to the content described in the foregoing embodiment.
  • FIG. 8 is a schematic structural diagram of a media data transmission device provided in this application.
  • the media data transmission device 800 of the embodiment of the present application includes:
  • the first generating unit 810 is configured to generate multiple original video frames
  • the second generating unit 820 is configured to generate a video using multiple original video frames
  • the obtaining unit 830 is configured to obtain location information of the target image in the video, where the target image is a video frame in the video or a part of the video frame in the video;
  • the sending unit 840 is used to send the video and location information to the storage device.
  • when the target image is a large image, the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to the specific video frame;
  • when the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to the specific video frame, and the position of the target image in the corresponding video frame.
  • the media data transmission apparatus does not generate the target image, and does not send the target image to the storage device.
  • the position information of the target image in the video is obtained.
  • the target image is located in the I frame in the GOP of the video.
  • it is also used to: obtain a target image from multiple original video frames, and the image quality of the target image is an image including the target feature in the multiple original video frames.
  • FIG. 9 is a schematic structural diagram of a camera provided in this application.
  • the camera 900 in the embodiment of the present application includes a processor 910 and a transceiver module 920, where:
  • the processor 910 is configured to generate multiple original video frames, and use the multiple original video frames to generate a video; obtain position information of the target image in the video, where the target image is a video frame in the video or a part of the video frame in the video;
  • the transceiver module 920 is used to send the video and location information to the storage device.
  • when the target image is a large image, the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to the specific video frame.
  • when the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to the specific video frame, and the position of the target image in the corresponding video frame.
  • the processor 910 does not generate the target image
  • the transceiver module 920 does not send the target image to the storage device.
  • the position information of the target image in the video is obtained.
  • the target image is located in the I frame in the GOP of the video.
  • it is also used to: obtain a target image from multiple original video frames, and the image quality of the target image is an image including the target feature in the multiple original video frames.
  • an embodiment of the present application further provides a camera 1000.
  • the camera 1000 includes a processor 1010, a memory 1020, and a transceiver 1030.
  • the memory 1020 stores instructions or programs, and the processor 1010 is configured to execute the instructions or programs stored in the memory 1020.
  • the processor 1010 is used to perform the operations performed by the processor 910 in the foregoing embodiment
  • the transceiver 1030 is used to perform the operations performed by the transceiver module 920 in the foregoing embodiment.
  • the media data transmission device 1100 provided in the embodiment of the present application includes:
  • the receiving unit 1110 is configured to receive the video sent by the camera and the position information of the target image, where the target image is a video frame in the video or a part of the video frame in the video;
  • the obtaining unit 1120 is configured to obtain a target image from a corresponding video frame of the video according to the location information.
  • when the target image is a large image, the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to the specific video frame;
  • when the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to the specific video frame, and the position of the target image in the corresponding video frame.
  • the target image is obtained from the corresponding video frame in the video according to the location information.
  • it is also used to: at the end of the storage life cycle of the video, obtain the target image from the corresponding video frame of the video according to the location information; save the target image, and delete the video.
  • the storage device 1200 provided in the embodiment of the present application includes a transceiver module 1210 and a processor 1220:
  • the transceiver module 1210 is used to receive the video sent by the camera and the location information of the target image, where the target image is a video frame in the video or a part of the video frame in the video;
  • the processor 1220 is configured to obtain a target image from a corresponding video frame of the video according to the location information.
  • when the target image is a large image, the position information includes the first absolute position and/or the first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to the specific video frame.
  • when the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp of the video frame corresponding to the target image in the video.
  • the second absolute position also includes the position of the target image in the corresponding video frame;
  • the second relative position includes the offset of the target image relative to the specific video frame, and the position of the target image in the corresponding video frame.
  • the storage device 1200 when the storage device 1200 receives a request to read the target image, the storage device 1200 obtains the target image from the corresponding video frame in the video according to the location information.
  • it is also used to: at the end of the storage life cycle of the video, obtain the target image from the corresponding video frame of the video according to the location information; save the target image, and delete the video.
  • an embodiment of the present application further provides a server 1300.
  • the server 1300 includes a processor 1310, a memory 1320, and a transceiver 1330.
  • the memory 1320 stores instructions or programs, and the processor 1310 is configured to execute the instructions or programs stored in the memory 1320.
  • the processor 1310 is used to perform the operations performed by the processor 1220 in the foregoing embodiment
  • the transceiver 1330 is used to perform the operations performed by the transceiver module 1210 in the foregoing embodiment.
  • the embodiments of the present application also provide a computer-readable storage medium, wherein the computer-readable storage medium can store a program, and when the program is executed, part or all of the steps of any of the media data transmission methods described in the above method embodiments are performed.
  • the embodiments of the present application also provide a computer program product, wherein when the computer program product is read and executed by a computer, part or all of the steps of any media data transmission method recorded in the above method embodiments will be executed.
  • the foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, they can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from a website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a storage disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a Solid State Disk (SSD)).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Disclosed are a media data transmission method and a related device. The media data transmission method comprises: a camera generating a plurality of original video frames; the camera generating a video by using the plurality of original video frames; the camera acquiring information of the position of a target image in the video, wherein the target image is a video frame in the video or part of the video frame in the video; and the camera sending the video and the position information to a storage device. The video and the position information of the target image can be transmitted, such that the consumption of a transmission bandwidth is effectively reduced.

Description

Media data transmission method and related device
Technical field
This application relates to the field of data processing, and in particular to a media data transmission method and a related device.
Background
A camera shoots a large amount of video data and then transmits the video to a storage device for storage.
With the advancement of camera intelligence, in addition to video, the camera also needs to transmit pictures to the storage device so that the storage device can perform operations such as image recognition. The additional picture transmission increases both network bandwidth usage and the space occupied on the storage device.
Summary of the invention
The embodiments of this application provide a media data transmission method and a related device that transmit a video together with the position information of an image, which effectively reduces the consumption of transmission bandwidth.
In the first aspect, an embodiment of this application provides a media data transmission method, the method including:
The camera generates multiple original video frames;
The camera uses the multiple original video frames to generate a video;
The camera obtains the position information of a target image in the video, where the target image is a video frame in the video or a part of a video frame in the video;
The camera sends the video and the position information to the storage device.
In this example, the video and the position information of the target image are transmitted. Because the position information is far smaller than the image itself, it occupies far less transmission bandwidth than the image would, which reduces the consumption of bandwidth resources.
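The size argument above can be made concrete with a small calculation. This is a sketch under assumptions that are not in the application: the binary layout of the position record (frame number, millisecond timestamp, and x, y, width, height of a small image) and the 50 KiB size of an encoded picture are both hypothetical figures chosen only for illustration.

```python
import struct

# Hypothetical wire layout: uint32 frame number, uint64 timestamp in ms,
# then four uint16 values (x, y, width, height), little-endian, no padding.
LOCATION_FORMAT = "<IQ4H"


def pack_location(frame_number, timestamp_ms, x, y, width, height):
    """Serialize one position record for a small image."""
    return struct.pack(LOCATION_FORMAT, frame_number, timestamp_ms, x, y, width, height)


record = pack_location(1200, 1_600_000_000_000, 320, 180, 80, 80)
assumed_image_bytes = 50 * 1024  # assumed size of one separately sent picture

print(len(record))                          # bytes per position record
print(assumed_image_bytes // len(record))   # how many records fit in one picture
```

Under these assumptions one position record takes 20 bytes, so thousands of records cost less bandwidth than a single separately transmitted picture.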
With reference to the first aspect, in a possible embodiment of the first aspect,
when the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
With reference to the first aspect, in a possible embodiment of the first aspect,
when the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and also includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to the specific video frame and the position of the target image in the corresponding video frame.
In this example, the position information includes an absolute position and a relative position. Both can be transmitted at the same time so that they can be checked against each other, which avoids errors caused by video frames being lost during transmission.
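One way the mutual check could work is sketched below. The application does not specify the checking procedure, so this is purely an assumption: the receiver is taken to know the frame numbers of the frames it actually received, and the function name `locate_frame` and its parameters are hypothetical.

```python
def locate_frame(received_numbers, absolute_number, reference_number, offset):
    """Locate the target frame in the received sequence and cross-check it.

    received_numbers: frame numbers of the frames that actually arrived, in order
    absolute_number:  frame number of the target frame (absolute position)
    reference_number: frame number of the specific (reference) video frame
    offset:           offset of the target frame relative to the reference frame
    Returns (index, consistent). consistent is False when the two positions
    disagree, for example because a frame between them was lost.
    Raises ValueError if the target or reference frame itself is missing.
    """
    by_absolute = received_numbers.index(absolute_number)
    by_relative = received_numbers.index(reference_number) + offset
    return by_absolute, by_absolute == by_relative
```

For example, if frames 10 to 14 arrive intact, a target at absolute frame 13 with reference frame 10 and offset 3 is found at index 3 and both positions agree. If frame 12 is lost in transit, counting by offset alone would land on the wrong frame, and the disagreement exposes the loss.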
With reference to the first aspect, in a possible embodiment of the first aspect, the camera does not generate the target image and does not send the target image to the storage device.
In this example, because the camera neither generates the target image nor sends it to the storage device, the target image does not need to be transmitted separately, which reduces the consumption of bandwidth resources.
With reference to the first aspect, in a possible embodiment of the first aspect, the method further includes:
The camera selects, from the video, the target frame in which the target image is located;
The camera obtains the position information of the target image in the video according to the target frame and the video.
With reference to the first aspect, in a possible embodiment of the first aspect,
the target image is located in the I frame of a group of pictures (GOP) of the video.
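Restricting target images to I frames means that a target frame can be decoded without reference to other frames. A minimal sketch of finding the I frame for a given frame index follows. It assumes that every GOP begins with its I frame and that frame types are available as a simple list, and the function name `gop_key_frame` is hypothetical.

```python
def gop_key_frame(frame_types, index):
    """Index of the I frame that starts the GOP containing frame `index`.

    frame_types: list of frame type letters in stream order, e.g. "I", "P", "B"
    Assumes each GOP starts with its I frame (an assumption of this sketch).
    """
    for i in range(index, -1, -1):   # walk backwards to the nearest I frame
        if frame_types[i] == "I":
            return i
    raise ValueError("no I frame at or before the given index")
```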
With reference to the first aspect, in a possible embodiment of the first aspect, the method further includes:
The camera obtains a target image from the multiple original video frames, where the image quality of the target image is an image including the target feature in the multiple original video frames.
With reference to the first aspect, in a possible embodiment of the first aspect, the method further includes: the storage device receives the video sent by the camera and the location information of the target image;
The storage device obtains the target image from the corresponding video frame of the video according to the location information.
With reference to the first aspect, in a possible embodiment of the first aspect, the method further includes:
The storage device stores the video sent by the camera and the location information of the target image;
At the end of the storage life cycle of the video, the storage device obtains the target image from the corresponding video frame of the video according to the location information;
The storage device saves the target image;
The storage device deletes the video.
In this example, at the end of the storage life cycle of the video, the target image is obtained from the corresponding video frame of the video according to the location information and saved, so that the target image remains stored after the video is deleted and can be provided for subsequent image searches.
With reference to the first aspect, in a possible embodiment of the first aspect, the method further includes:
The storage device receives the video stream sent by the camera and the location information of the target image, where the target image is a video frame in the video stream or a part of a video frame in the video stream;
The storage device obtains the target image from the corresponding video frame of the video stream according to the location information.
In a second aspect, an embodiment of this application provides a media data transmission method, the method including:
a storage device receives a video sent by a camera and position information of a target image, where the target image is a video frame in the video or a part of a video frame in the video;
the storage device obtains the target image from the corresponding video frame of the video according to the position information.
With reference to the second aspect, in a possible embodiment of the second aspect,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
With reference to the second aspect, in a possible embodiment of the second aspect,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
With reference to the second aspect, in a possible embodiment of the second aspect, during the storage life cycle of the video, when the storage device receives a request to read the target image, the storage device obtains the target image from the corresponding video frame in the video according to the position information.
With reference to the second aspect, in a possible embodiment of the second aspect, the method further includes: at the end of the storage life cycle of the video, the storage device obtains the target image from the corresponding video frame of the video according to the position information; the storage device saves the target image and deletes the video.
In this example, during the storage life cycle of the video, the target image is obtained from the corresponding video frame in the video, so the target image itself does not need to be stored during that period. The storage device only needs to store the video and the position information. Because the position information occupies far less space than the target image, storing the position information instead of the target image reduces the consumption of storage resources.
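To give a feel for the size difference mentioned above, the following sketch packs one position record into a fixed binary layout and compares it with the size of an uncompressed crop. The layout (a 32-bit frame number followed by four 16-bit values for x, y, width, and height) and the 64 x 64 RGB crop are illustrative assumptions only; this application does not specify an encoding.

```python
import struct

# Hypothetical wire layout: frame number (uint32), then x, y, m, n (uint16 each),
# little-endian with no padding.
LOCATION_FORMAT = "<IHHHH"


def pack_location(frame_number: int, x: int, y: int, m: int, n: int) -> bytes:
    """Serialize one position record into the assumed binary layout."""
    return struct.pack(LOCATION_FORMAT, frame_number, x, y, m, n)


def unpack_location(data: bytes) -> tuple:
    """Recover (frame_number, x, y, m, n) from a packed record."""
    return struct.unpack(LOCATION_FORMAT, data)


record = pack_location(5, 100, 200, 64, 64)
raw_crop_bytes = 64 * 64 * 3  # uncompressed 64x64 crop at 3 bytes per pixel

print(len(record), raw_crop_bytes)  # prints "12 12288"
```

A compressed image would be smaller than the raw figure used here, so the ratio in practice depends on the codec and the image content.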
With reference to the second aspect, in a possible embodiment of the second aspect, the storage device does not store the target image during the storage life cycle of the video; after the storage life cycle of the video ends, the storage device stores the target image.
In this example, the target image does not need to be stored during the storage life cycle of the video, which reduces the consumption of storage resources.
In a third aspect, an embodiment of this application provides a media data transmission apparatus, the apparatus including:
a first generating unit, configured to generate multiple original video frames;
a second generating unit, configured to generate a video using the multiple original video frames;
an acquiring unit, configured to acquire position information of a target image in the video, where the target image is a video frame in the video or a part of a video frame in the video;
a sending unit, configured to send the video and the position information to a storage device.
With reference to the third aspect, in a possible embodiment of the third aspect,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
With reference to the third aspect, in a possible embodiment of the third aspect,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
With reference to the third aspect, in a possible embodiment of the third aspect, the media data transmission apparatus does not generate the target image and does not send the target image to the storage device.
With reference to the third aspect, in a possible embodiment of the third aspect, the following is further included:
the camera selects, from the video, the target frame where the target image is located;
the camera acquires the position information of the target image in the video according to the target frame and the video.
With reference to the third aspect, in a possible embodiment of the third aspect,
the target image is located in an I frame of a group of pictures (GOP) of the video.
With reference to the third aspect, in a possible embodiment of the third aspect, the apparatus is further configured to:
obtain the target image from the multiple original video frames, the target image being, in terms of image quality, an image among the multiple original video frames that includes the target feature.
In a fourth aspect, an embodiment of this application provides a media data transmission apparatus, the apparatus including:
a receiving unit, configured to receive a video sent by a camera and position information of a target image, where the target image is a video frame in the video or a part of a video frame in the video;
an acquiring unit, configured to obtain the target image from the corresponding video frame of the video according to the position information.
With reference to the fourth aspect, in a possible embodiment of the fourth aspect,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
With reference to the fourth aspect, in a possible embodiment of the fourth aspect,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
With reference to the fourth aspect, in a possible embodiment of the fourth aspect, during the storage life cycle of the video, when the storage device receives a request to read the target image, the storage device obtains the target image from the corresponding video frame in the video according to the position information.
With reference to the fourth aspect, in a possible embodiment of the fourth aspect, the apparatus is further configured so that: at the end of the storage life cycle of the video, the storage device obtains the target image from the corresponding video frame of the video according to the position information; the storage device saves the target image and deletes the video.
With reference to the fourth aspect, in a possible embodiment of the fourth aspect, the apparatus is further configured to:
receive the video stream sent by the camera and the position information of the target image, where the target image is a video frame in the video stream or a part of a video frame in the video stream;
obtain the target image from the corresponding video frame of the video stream according to the position information.
In a fifth aspect, an embodiment of this application provides a camera, the camera including:
a processor, configured to generate multiple original video frames, generate a video using the multiple original video frames, and acquire position information of a target image in the video, where the target image is a video frame in the video or a part of a video frame in the video;
a transceiver module, configured to send the video and the position information to a storage device.
With reference to the fifth aspect, in a possible embodiment of the fifth aspect,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
With reference to the fifth aspect, in a possible embodiment of the fifth aspect,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
With reference to the fifth aspect, in a possible embodiment of the fifth aspect, the camera does not generate the target image, and the transceiver module does not send the target image to the storage device.
With reference to the fifth aspect, in a possible embodiment of the fifth aspect, the camera is further configured to:
select, from the video, the target frame where the target image is located;
acquire the position information of the target image in the video according to the target frame and the video.
With reference to the fifth aspect, in a possible embodiment of the fifth aspect,
the target image is located in an I frame of a group of pictures (GOP) of the video.
With reference to the fifth aspect, in a possible embodiment of the fifth aspect, the camera is further configured to:
obtain the target image from the multiple original video frames, the target image being, in terms of image quality, an image among the multiple original video frames that includes the target feature.
In a sixth aspect, an embodiment of this application provides a storage device, the device including:
a transceiver module, configured to receive a video sent by a camera and position information of a target image, where the target image is a video frame in the video or a part of a video frame in the video;
a processor, configured to obtain the target image from the corresponding video frame of the video according to the position information.
With reference to the sixth aspect, in a possible embodiment of the sixth aspect,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
With reference to the sixth aspect, in a possible embodiment of the sixth aspect,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
With reference to the sixth aspect, in a possible embodiment of the sixth aspect, during the storage life cycle of the video, when the storage device receives a request to read the target image, the storage device obtains the target image from the corresponding video frame in the video according to the position information.
With reference to the sixth aspect, in a possible embodiment of the sixth aspect, the storage device is further configured to: at the end of the storage life cycle of the video, obtain the target image from the corresponding video frame of the video according to the position information; save the target image and delete the video.
With reference to the sixth aspect, in a possible embodiment of the sixth aspect, the storage device is further configured to:
receive the video stream sent by the camera and the position information of the target image, where the target image is a video frame in the video stream or a part of a video frame in the video stream;
obtain the target image from the corresponding video frame of the video stream according to the position information.
In a seventh aspect, an embodiment of this application provides a camera, the camera including a processor, a transceiver, and a memory, where the processor executes code in the memory to perform the method according to the first aspect.
In an eighth aspect, an embodiment of this application provides a storage device, the storage device including a processor, a transceiver, and a memory, where the processor executes code in the memory to perform the method according to the second aspect.
In a ninth aspect, an embodiment of this application provides a computer-readable storage medium that stores a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to perform the method according to any one of the first aspect and the second aspect.
In a tenth aspect, a computer program product is provided. When the computer program product is read and executed by a computer, the method according to any one of the first aspect and the second aspect is performed.
These and other aspects of this application will be more concise and easier to understand in the description of the following embodiments.
Description of the drawings
To explain the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application.
FIG. 1 is a schematic diagram of a camera capturing images according to an embodiment of this application;
FIG. 2 is a schematic diagram of a large image and a small image according to an embodiment of this application;
FIG. 3A is a schematic diagram of transmitting video frames according to an embodiment of this application;
FIG. 3B is a schematic diagram of a camera transmitting video and position information according to an embodiment of this application;
FIG. 4 is a schematic diagram of a storage device storing video and position information according to an embodiment of this application;
FIG. 5A is a schematic diagram of an index table of large images according to an embodiment of this application;
FIG. 5B is a schematic diagram of an index table of small images according to an embodiment of this application;
FIG. 6A is a schematic diagram of a camera extracting a target image according to an embodiment of this application;
FIG. 6B is a schematic diagram of an index table for multiplexed storage of small images according to an embodiment of this application;
FIG. 6C is a schematic diagram of another index table for multiplexed storage of small images according to an embodiment of this application;
FIG. 6D is a schematic diagram of another index table of large images according to an embodiment of this application;
FIG. 7 is a schematic interaction diagram of a media data transmission method according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of a media data transmission apparatus according to an embodiment of this application;
FIG. 9 is a schematic structural diagram of a camera according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of another camera according to an embodiment of this application;
FIG. 11 is a schematic structural diagram of a media data transmission apparatus according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of a storage device according to an embodiment of this application;
FIG. 13 is a schematic structural diagram of a server according to an embodiment of this application.
Detailed description
The embodiments of this application are described below with reference to the drawings.
First, the video transmission process involved in this application is introduced.
As shown in FIG. 1, the camera captures video along the time axis t. The video includes n frames of images I_1, I_2, …, I_n, where image I_1 is captured by the camera at time t_1, image I_2 is captured at time t_2, …, and image I_n is captured at time t_n. The time intervals between t_1, t_2, …, t_n may or may not be equal; that is, t_n − t_(n−1), t_(n−1) − t_(n−2), …, t_2 − t_1 may or may not be equal, which is not specifically limited here.
In addition to sending the video to the storage device, the camera also needs to select a target image from the video and send it to the storage device. The target image may be a large image or a small image, and the storage device may be a storage server, a video stream management platform, or the like. As shown in FIG. 2, a large image may be the complete image of a video frame, or an image that occupies more than a preset threshold of the area of a video frame, and so on. In a specific embodiment, a large image may contain multiple target subjects (a subject can be understood as a target feature). For example, a large image may show a scene in which a vehicle hits a pedestrian, so a large image can be used to analyze the relationships between different target subjects and their behavior. A small image may be a partial region of a video frame. In a specific embodiment, a small image may include only a single target subject, or only a partial region of a single target subject. For example, a small image may include the face of a pedestrian, so a small image can be used to analyze the details and structure of a single target subject. Here, the target subject may be a pedestrian, an animal, a vehicle, a license plate, a road sign, a traffic light, and so on, which is not specifically limited here. A small image may be obtained by extracting, from a large image, the image block of the region containing the target subject. The extraction method may be an image feature extraction algorithm, specifically HOG (histogram of oriented gradients), SIFT (scale-invariant feature transform), and so on, which is not specifically limited here.
To reduce the bandwidth needed for the camera to send the video and the target image to the storage device, the camera may also compress the video with a video compression algorithm and compress the target image with an image compression algorithm. The video compression algorithm may be H.264, H.265, H.266, and so on, and the image compression algorithm may be JPEG, HEIF, and so on; neither is specifically limited here.
Compressing the video and the target image in this way reduces the bandwidth occupied. However, transmission still consumes bandwidth, and when a large number of target images are transmitted the consumption remains very large. In addition, after the target images reach the storage device, the storage device must spend memory resources to store them, which leads to a large consumption of storage resources. How to reduce the transmission of target images is therefore a problem that needs to be solved.
The embodiments of this application aim to solve the large bandwidth consumption when transmitting target images and the large consumption of memory and hard disk resources when the storage device stores them. To this end, the video (which may be transmitted as a stream, called a video stream) and the position information of the target image are transmitted instead. Because the position information is far smaller than the image, it occupies far less bandwidth than the image, which reduces the consumption of bandwidth resources. After the storage device receives the video and the position information of the target image, it stores the video and the position information; storing the position information instead of the target image also reduces the consumption of storage resources.
This application provides a media data transmission method and related devices that can effectively reduce the consumption of bandwidth resources and storage resources.
When the camera sends the video, including images I_1, I_2, …, I_n, to the storage device, it needs to send not only the video but also the position information of the target image, which marks the position of the target image in the video. The video and the position information may be transmitted together or separately, and may be carried on the same channel or on different channels (for example, both on the data channel, or the video stream on the data channel and the position information on the management channel); this is not specifically limited here. The position information is described in detail below, first for the case where the target image is a large image and then for the case where it is a small image.
When the category of the target image is a large image, the position information of the target image may be a first absolute position or a first relative position, or may include both. The first absolute position may include one or more of the frame number, the timestamp, and so on of the target image in the video. For example, if the video includes n frames of images I_1, I_2, …, I_n and the target image is the 5th frame, the first absolute position may be the frame number 5 of the target image in the video. The first relative position may be an offset relative to a specific video frame. For example, if the video includes n frames of images I_1, I_2, …, I_n and the target image is the 5th frame, then 5 − 1 = 4, so the first relative position may be the offset 4 of the target image relative to the first frame. Alternatively, when the specific video frame is, for example, an I frame obtained by compressing the video frames, the offset may be the offset of the target image relative to that I frame. When the video frame corresponding to the target image is an I frame, the position information may be described by the first absolute position, for example the frame number of the I frame. When the video frame corresponding to the target image is not an I frame, the position information may be described by the first relative position, for example the offset of the target image relative to the I frame. Supplementary note: when the large image is not a complete frame image, the position information of the target image may also include the position of the target image in the video frame, where the position includes coordinates and size (see the description below).
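The two ways of locating a large image described above can be sketched as a small resolver that returns the frame number from either form. The class and field names are hypothetical, frame numbers are taken to be 1-based as in the example (frame 5, offset 4 from frame 1), and the offset is assumed to be a simple count of frames.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LargeImagePosition:
    frame_number: Optional[int] = None     # first absolute position (1-based frame number)
    reference_frame: Optional[int] = None  # the specific frame the offset is measured from
    offset: Optional[int] = None           # first relative position, in frames


def resolve_frame_number(pos: LargeImagePosition) -> int:
    """Return the frame number of the target image, preferring the absolute position."""
    if pos.frame_number is not None:
        return pos.frame_number
    if pos.reference_frame is not None and pos.offset is not None:
        return pos.reference_frame + pos.offset
    raise ValueError("position carries neither an absolute nor a relative location")
```

With these assumptions, `LargeImagePosition(frame_number=5)` and `LargeImagePosition(reference_frame=1, offset=4)` both resolve to frame 5, matching the 5 − 1 = 4 example in the text.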
When the category of the target image is a small image, the position information of the target image may be the second absolute position, the second relative position, or both. The second absolute position may include one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, together with the position of the target image within that video frame. The position includes coordinates and size: the coordinates of the target image in the video frame can be expressed as (x, y), where x is the horizontal coordinate and y is the vertical coordinate, and the size of the target image in the video frame can be expressed as mXn, where m is the horizontal size and n is the vertical size. The second relative position includes any of the following combinations: the absolute position of the video frame corresponding to the target image and the relative position of the target image within that frame; the relative position of the video frame and the absolute position of the target image within that frame; or the relative position of the video frame and the relative position of the target image within that frame. The relative position of the target image within the video frame may be an offset relative to the position of a specific marker. For example, if the video frame is an image of a target pedestrian visiting Tiananmen and the target image is the target pedestrian, the relative position of the target image in the video frame may be given as the direction and distance of the target pedestrian relative to Tiananmen; a target pedestrian located 100 meters to the east of Tiananmen is expressed as (East, 100).
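The position information of a small image described above can be sketched as a small data model. This is only an illustrative sketch: the class names `Region`, `SecondAbsolutePosition`, and `LandmarkOffset`, the field names, and the millisecond timestamp unit are assumptions made for the example and do not appear in the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """Position of a small image inside its video frame (hypothetical type)."""
    x: int  # horizontal coordinate
    y: int  # vertical coordinate
    m: int  # horizontal size
    n: int  # vertical size


@dataclass(frozen=True)
class SecondAbsolutePosition:
    """Frame identified by frame number and/or timestamp, plus the in-frame region."""
    region: Region
    frame_number: Optional[int] = None
    timestamp_ms: Optional[int] = None  # unit is an assumption

    def __post_init__(self):
        # At least one of the two frame identifiers must be present.
        if self.frame_number is None and self.timestamp_ms is None:
            raise ValueError("need a frame number, a timestamp, or both")


@dataclass(frozen=True)
class LandmarkOffset:
    """Relative in-frame position, e.g. (East, 100) from a specific marker."""
    landmark: str
    direction: str
    distance_m: float


pos = SecondAbsolutePosition(Region(x=120, y=40, m=80, n=80), frame_number=250)
rel = LandmarkOffset(landmark="Tiananmen", direction="East", distance_m=100)
```

Keeping the frame identifier and the in-frame region in separate fields mirrors the text, where either part may independently be given in absolute or relative form.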
The position information of the target image may also include the frame category of the video frame containing the target image when that frame is transmitted after compression; the frame categories include I frames and P frames. When the camera sends a video to the storage device, for example a video including images I_1, I_2, ..., I_n, the video frames need to be encoded and then transmitted, as shown in FIG. 3A, which is a schematic diagram of video frame transmission. Encoding the video frames yields I frames and P frames. An I frame may be a complete video frame and can be understood as a key frame; a P frame is the difference between the current frame and a preceding key frame. For example, if the I frame is an image of a moving car at a certain moment, the P frame may be the position offset of the car at the next moment relative to the previous moment. The target image may be an I frame or a P frame.
When the camera sends the video and the position information to the storage device, it may also send associated information of the target image. The associated information includes the capture time of the video, the capture time of the video frame corresponding to the target image, the camera identifier, the category of the target image, the sequence number of the target image, the frame number of the video frame corresponding to the target image, and the offset of the video frame corresponding to the target image. The categories of the target image include large image and small image, and the capture time of the video may be, for example, the start time of the video.
It can be understood that, in practical applications, the absolute position and the relative position can be transmitted together so that each can be checked against the other, which avoids errors caused by video frames being lost during transmission.
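One way such a mutual check could work is sketched below. The sketch assumes that the relative position is a frame-count offset from a reference I frame whose frame number is known; the application does not fix this encoding, and the function names are hypothetical.

```python
from typing import Optional


def positions_agree(abs_frame_number: int,
                    ref_frame_number: int,
                    offset: int) -> bool:
    """True when reference frame + offset points at the absolute frame number."""
    return ref_frame_number + offset == abs_frame_number


def resolve_frame(abs_frame_number: int,
                  ref_frame_number: int,
                  offset: int) -> Optional[int]:
    """Return the frame number if both positions agree, else None.

    None signals a mismatch, for example because frames were lost in
    transmission and the frame numbering shifted.
    """
    if positions_agree(abs_frame_number, ref_frame_number, offset):
        return abs_frame_number
    return None
```

A caller that receives `None` could fall back to the timestamp or request retransmission; that policy is outside what the text specifies.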
In a specific embodiment, as shown in FIG. 3B, the position information transmitted between the camera and the storage device is compressed. Specifically, the camera may compress the original position information and send the compressed position information to the storage device. The original position information may be part or all of the position information described above. When the original position information is only part of that position information, it includes at least the position information of the target image.
FIG. 4 shows a specific example of how the storage device stores the received video and position information. After receiving the video and the position information, the storage device stores the video in the storage space corresponding to the video and stores the position information. To facilitate storage, the position information may also be further processed to obtain an index table of the target image, and the index table is then stored. The position information (or the index table) represents the position of the target image in the video. The index table includes a large-image index table and a small-image index table, which can be stored separately; that is, all large-image index tables are stored in one storage space and all small-image index tables are stored in another storage space.
When the position information is processed, the index table of the target image may specifically be obtained as follows: extract the position information of the target image, and generate the index table of the target image according to a preset index table generation template. The position information of the target image may be extracted from the cache or from memory. The preset index table generation template may be a template set in advance. The index table of the target image may of course also be obtained in other ways.
In a specific embodiment in which the index table is generated according to the index table template, FIG. 5A is a schematic diagram of a large-image index table and FIG. 5B is a schematic diagram of a small-image index table. As shown in FIG. 5A, the large-image index table includes the camera identifier, the capture time of the video frame corresponding to the large image, the frame number of the video frame, the picture type, the picture sequence number, the video frame type, the video frame offset, and so on; these contents can be extracted directly from the received position information. As shown in FIG. 5B, the small-image index table includes the camera identifier, the capture time of the video frame containing the small image, the frame number of the video frame, the picture type, the picture sequence number, the video frame type, the video frame offset, the offset of the small image in the video frame, the size of the small image in the video frame, and so on. The offset of the small image in the video frame is expressed as coordinates, and its size is expressed as a horizontal size and a vertical size; for example, 80X80 means that the horizontal size is 80 and the vertical size is 80. These contents can likewise be extracted directly from the received position information.
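The two index tables can be sketched as follows. The field list follows FIG. 5A and FIG. 5B as described in the text, but the Python names, the dictionary-shaped input records, and the function `build_index_tables` are assumptions made for illustration, not the template the application refers to.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Fields shared by the large-image and small-image index tables.
COMMON_FIELDS = ("camera_id", "capture_time", "frame_number", "picture_type",
                 "picture_seq", "frame_type", "frame_offset")


@dataclass
class LargeImageEntry:
    camera_id: str
    capture_time: str
    frame_number: int
    picture_type: str   # "large" or "small"
    picture_seq: int
    frame_type: str     # "I" or "P"
    frame_offset: int


@dataclass
class SmallImageEntry(LargeImageEntry):
    offset_in_frame: Tuple[int, int]  # (x, y) coordinates
    size_in_frame: Tuple[int, int]    # (horizontal, vertical), e.g. (80, 80)


def build_index_tables(records: List[Dict]):
    """Split received position records into separate large/small index tables."""
    large: List[LargeImageEntry] = []
    small: List[SmallImageEntry] = []
    for rec in records:
        common = {key: rec[key] for key in COMMON_FIELDS}
        if rec["picture_type"] == "small":
            small.append(SmallImageEntry(
                **common,
                offset_in_frame=tuple(rec["offset_in_frame"]),
                size_in_frame=tuple(rec["size_in_frame"]),
            ))
        else:
            large.append(LargeImageEntry(**common))
    return large, small
```

Returning two lists reflects the option of storing the large-image and small-image index tables in separate storage spaces.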
The storage device does not need to store the target image separately. When the storage device receives a request from the host to read the target image, it uses the index table to obtain the corresponding target image from the video and sends it to the host.
After storing the video, the storage device sets a storage life cycle for the video and deletes the video once the storage life cycle ends; the storage life cycle can be understood as a fixed duration. So that the target image can still be read after the video is deleted, the storage device may, at the end of the storage life cycle of the video, extract the target image from the video, store the target image in the corresponding storage space, and update the picture index table (the updated index table describes the storage location of the target image in the storage device). The end of the storage life cycle is the condition that triggers extraction of the target image from the video; after the extraction is complete, the storage device deletes the video. In terms of specific timing, "the end of the life cycle" covers both the period just before the end time point and a short period after it. The storage device may of course also extract the target image before the storage life cycle of the video ends, for example by completing the extraction and storage within the 10 minutes before the storage life cycle ends, so that the storage device can delete the video immediately once the storage life cycle ends.
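The ordering constraint in this paragraph (extract and store first, delete the video last) can be sketched with in-memory dictionaries standing in for the storage spaces. The `video_id` key on index entries, the storage-key format, and all function names are hypothetical.

```python
from typing import Any, Callable, Dict, List


def lifecycle_ended(stored_at: float, lifetime: float, now: float) -> bool:
    """The storage life cycle is treated here as a fixed duration."""
    return now - stored_at >= lifetime


def expire_video(video_id: str,
                 videos: Dict[str, Any],
                 index: List[Dict[str, Any]],
                 images: Dict[str, Any],
                 extract: Callable[[Any, Dict[str, Any]], Any]) -> None:
    """Extract indexed target images, store them, update the index, delete the video."""
    video = videos[video_id]
    for entry in index:
        if entry["video_id"] != video_id:
            continue
        image_key = f"{video_id}-{entry['picture_seq']}"
        images[image_key] = extract(video, entry)   # save the target image
        entry["storage_location"] = image_key       # index now points at the image
    del videos[video_id]                            # delete only after extraction


videos = {"v1": ["frame0", "frame1", "frame2"]}
index = [{"video_id": "v1", "picture_seq": 1, "frame_number": 2}]
images: Dict[str, Any] = {}
if lifecycle_ended(stored_at=0.0, lifetime=3600.0, now=3600.0):
    expire_video("v1", videos, index, images,
                 extract=lambda video, e: video[e["frame_number"]])
```

Running the extraction slightly before the end time, as the text allows, only changes when `expire_video` is called, not what it does.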
When the target image is extracted from the video according to its index table, the procedure may specifically be as follows. At least one video corresponding to the camera identifier is obtained according to the camera identifier, and the target video, which contains the target image, is then determined from the at least one video according to the capture time of the video. The video frame is extracted from the target video according to the video frame time, the video frame type, the frame number of the video frame in the video, and the video frame offset recorded in the index table, and the target image is obtained from the video frame according to the position information of the target image. If the target image is a large image, the video frame can be taken as the target image (here the large image is assumed, by way of example, to be a complete video frame); if the target image is a small image, the target image is obtained according to the offset and the size of the picture in the video frame recorded in the index table. In a specific example, FIG. 6A is a schematic diagram of extracting a target image. The storage device retrieves n videos (video 1, video 2, ..., video n-1, video n) from the storage space corresponding to the videos, determines m videos (video k, ..., video j) from the n videos according to the camera identifier, determines the target video from the m videos according to the capture time of the video, and then determines the target image from the n video frames of the target video according to the other information in the index table. The other information includes the video frame time, the video frame type, the frame number of the video frame, the video frame offset, and the position information of the target image (the first absolute position and/or the first relative position, or the second absolute position and/or the second relative position).
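The lookup chain above (camera identifier, then capture time, then frame, then in-frame region) can be sketched as follows. The sketch assumes already-decoded frames held as lists of pixel rows and a frame number that indexes directly into the frame list; a real device would decode I and P frames first. All dictionary keys and function names are illustrative assumptions.

```python
from typing import Any, Dict, List, Optional

Frame = List[List[int]]  # a decoded frame as rows of pixel values (assumption)


def crop(frame: Frame, x: int, y: int, m: int, n: int) -> Frame:
    """Cut an m-wide, n-high region whose top-left corner is (x, y)."""
    return [row[x:x + m] for row in frame[y:y + n]]


def find_target_video(videos: List[Dict[str, Any]],
                      camera_id: str,
                      capture_time: float) -> Optional[Dict[str, Any]]:
    """Filter by camera identifier, then by the capture-time window."""
    for video in videos:
        if (video["camera_id"] == camera_id
                and video["start"] <= capture_time < video["end"]):
            return video
    return None


def extract_target_image(videos: List[Dict[str, Any]],
                         entry: Dict[str, Any]) -> Optional[Frame]:
    video = find_target_video(videos, entry["camera_id"], entry["capture_time"])
    if video is None:
        return None
    frames = video["frames"]
    if not 0 <= entry["frame_number"] < len(frames):
        return None  # the indexed frame is not present in this video
    frame = frames[entry["frame_number"]]
    if entry["picture_type"] == "large":
        return frame  # large image taken to be the complete frame
    x, y = entry["offset_in_frame"]
    m, n = entry["size_in_frame"]
    return crop(frame, x, y, m, n)
```

For a small image the result is just the rectangle given by the offset and size in the index entry; for a large image the whole frame is returned, matching the example assumption in the text.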
When extracting the target image from the video, the storage device first determines whether the target image exists in the video; if it does, the target image is extracted from the video according to its index table and stored in the picture storage space. Whether the target image exists in the video can be determined from the index table: specifically, the frame number and the capture time of the video frame recorded in the index table are used to determine whether the video frame exists in the video. If the video frame exists, the target image is determined to exist in the video; otherwise, the target image is determined not to exist in the video.
When the target images are stored, large images and small images may be stored separately or stored in a multiplexed manner. Multiplexed storage means that, when the target image is a small image, the large image or video frame containing it is stored together with the position information of the small image within that large image or video frame, so that both the small image and the large image/video frame are effectively stored. The target image may be encoded in any of various encoding formats before being stored; the encoding format may be, for example, HEIF.
After the video is deleted, the positions recorded in the current index table of the target image are positions within the video, so the current index table can no longer represent the position of the target image and needs to be updated. The updated index table describes the storage location of the target image, the picture type of the target image, the encoding format used when the target image was stored, and so on.
The updated picture index table differs depending on the storage method. In the updated picture index table, the storage location and the file name of the target image are added, and the original video-related position information, such as the video frame type and the video frame offset, is deleted. FIG. 6B is a schematic diagram of the small-image index table when the small image is stored in a multiplexed manner and the storage reuses the video frame corresponding to the small image. FIG. 6C shows the case in which the small image is stored in a multiplexed manner and the storage reuses the large image corresponding to the small image: if the large image is a complete video frame, the offset of the small image in the large image does not need to be obtained again; if the large image is part of a video frame, the offset of the small image in the large image needs to be obtained again, in the same way as the position information of the small image in the video frame is obtained above, which is not described again here. FIG. 6D is a schematic diagram of the updated large-image index table. The small-image storage type takes the value 0 or 1, where 1 indicates that the small image is stored in a multiplexed manner and 0 indicates that it is stored separately.
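The index update described here can be sketched as a function over a dictionary-shaped entry. The key names, the default encoding value, and the function `update_entry` are assumptions for illustration; only the 0/1 meaning of the small-image storage type comes from the text.

```python
from typing import Any, Dict

# Video-related fields that are dropped once the video has been deleted.
VIDEO_FIELDS = ("frame_type", "frame_offset")


def update_entry(entry: Dict[str, Any],
                 storage_location: str,
                 file_name: str,
                 encoding: str = "HEIF",
                 multiplexed: bool = False) -> Dict[str, Any]:
    """Return a new index entry that points at the stored image, not the video."""
    updated = {k: v for k, v in entry.items() if k not in VIDEO_FIELDS}
    updated["storage_location"] = storage_location
    updated["file_name"] = file_name
    updated["encoding"] = encoding
    if entry["picture_type"] == "small":
        # 1: small image stored in a multiplexed manner; 0: stored separately.
        updated["small_storage_type"] = 1 if multiplexed else 0
    return updated
```

Returning a new dictionary rather than mutating the old entry is a design choice of this sketch; it keeps the pre-deletion entry available until the update is known to have succeeded.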
After storing the video, if the storage device receives a request to read the target image, it can extract the target image either from the video or from the storage space of the target image and return it to the requester. Specifically, if the read request arrives within the storage life cycle of the video, the target image is extracted from the video according to the index table; if the read request arrives after the storage life cycle of the video has ended, the target image is extracted from the storage space of the target image according to the index table. When returning the target image, the storage device may convert it to the picture format expected by the requester; for example, if the requester requests JPEG, the target image is converted to JPEG. For how the target image is extracted from the video, see the image extraction method shown in FIG. 6A in the foregoing embodiment, which is not repeated here.
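The branching on the storage life cycle can be sketched as a small dispatcher. The callables passed in stand for the two read paths and the optional format conversion; their names and signatures are hypothetical, and no real image codec is invoked.

```python
from typing import Any, Callable, Dict, Optional


def serve_read_request(entry: Dict[str, Any],
                       request_time: float,
                       video_expires_at: float,
                       read_from_video: Callable[[Dict[str, Any]], Any],
                       read_from_image_store: Callable[[Dict[str, Any]], Any],
                       convert: Optional[Callable[[Any], Any]] = None) -> Any:
    """Pick the read path by whether the video's life cycle has ended."""
    if request_time < video_expires_at:
        image = read_from_video(entry)        # video still stored
    else:
        image = read_from_image_store(entry)  # video deleted, image saved
    # Optionally convert to the format the requester asked for (e.g. JPEG).
    return convert(image) if convert is not None else image
```

Passing the conversion as a callable keeps the dispatcher independent of any particular picture format.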
The storage device may also receive a video stream and the position information of the target image sent by the camera, and obtain the target image from the corresponding video frame in the video stream according to the position information. For the specific implementation, refer to the implementation described above in which the storage device receives the video and the position information of the target image from the camera and obtains the target image from the video; details are not repeated here.
FIG. 7 is a schematic interaction diagram of a media data transmission method provided in an embodiment of this application. The data transmission method of this embodiment includes the following steps:
S101. The camera obtains the position information of a target image in a video, where the target image is a video frame in the video or a part of a video frame in the video.
The target image includes a large image and/or a small image. A large image may be the complete image of a video frame, or an image that occupies more than a preset threshold of the area of a video frame, and so on; a small image may be a partial area of a video frame. In a specific embodiment, the small image may include only a single target subject, or only a partial area of a single target subject.
The position information includes an absolute position and a relative position. The absolute position may be, for example, the frame number or the timestamp of a video frame, and the relative position may be, for example, an offset relative to a specific video frame.
Before obtaining the position information of the target image in the video, the camera generates a plurality of original video frames and uses them to generate the video.
S102. The camera sends the video and the position information to the storage device.
The camera may send the video and the position information to the storage device at the same time or at different times.
S103. The storage device receives the video and the position information of the target image sent by the camera, where the target image is a video frame in the video or a part of a video frame in the video.
S104. The storage device obtains the target image from the corresponding video frame of the video according to the position information.
When the storage device obtains the target image according to the position information, it may do so according to the index table that carries the position information. Specifically, it may obtain the target image from the corresponding video frame in the video according to the index table, or obtain the target image from the storage space of the target image according to the index table.
In a possible implementation,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
In a possible implementation,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame, the position including coordinates and size; the second relative position includes the offset of the target image relative to a specific video frame, and the coordinates and size of the target image in the corresponding video frame.
In a possible embodiment, the camera does not generate the target image and does not send the target image to the storage device.
In a possible embodiment, the method further includes:
the camera selects, from the video, the target frame in which the target image is located;
the camera obtains the position information of the target image in the video according to the target frame and the video.
The target frame can be understood as the video frame in which the target image is located in the foregoing embodiments.
In a possible embodiment,
the target image is located in an I frame of a group of pictures (GOP) of the video.
An I frame in a GOP of the video is a key frame.
In a possible embodiment, the camera obtains the target image from the plurality of original video frames, where the image quality of the target image is that of an image, among the plurality of original video frames, that includes the target feature.
The target feature can be understood as a specific feature, for example, a behavior among multiple subjects.
In a possible embodiment, the method further includes: the storage device receives the video and the position information of the target image sent by the camera;
the storage device obtains the target image from the corresponding video frame of the video according to the position information.
In a possible embodiment, the method further includes:
the storage device stores the video and the position information of the target image sent by the camera;
at the end of the storage life cycle of the video, the storage device obtains the target image from the corresponding video frame of the video according to the position information;
the storage device saves the target image;
the storage device deletes the video.
In a possible embodiment, the method further includes:
the storage device receives a video stream and the position information of the target image sent by the camera, where the target image is a video frame in the video stream or a part of a video frame in the video stream;
the storage device obtains the target image from the corresponding video frame of the video stream according to the position information.
For brevity, this embodiment does not elaborate on the definitions of the large image, the small image, the position information, the index table, and so on; for details, see FIG. 2, FIG. 3A, FIG. 3B, FIG. 5A, and FIG. 5B and the related descriptions of the large image, the small image, the position information, the index table, the specific video frame, and so on. Nor does this embodiment describe how the camera captures the video or how the video is transmitted; for details, see FIG. 1, FIG. 3A, and FIG. 3B and the related descriptions. For other terms and their definitions, see the foregoing embodiments.
Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a media data transmission device provided in this application. The media data transmission device 800 of this embodiment of the application includes:
a first generating unit 810, configured to generate a plurality of original video frames;
a second generating unit 820, configured to generate a video using the plurality of original video frames;
an obtaining unit 830, configured to obtain the position information of a target image in the video, where the target image is a video frame in the video or a part of a video frame in the video;
a sending unit 840, configured to send the video and the position information to the storage device.
In a possible embodiment,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
In a possible embodiment,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
In a possible embodiment, the media data transmission device does not generate the target image and does not send the target image to the storage device.
In a possible embodiment, the media data transmission device is further configured to:
select, from the video, the target frame in which the target image is located;
obtain the position information of the target image in the video according to the target frame and the video.
In a possible embodiment,
the target image is located in an I frame of a group of pictures (GOP) of the video.
In a possible embodiment, the media data transmission device is further configured to obtain the target image from the plurality of original video frames, where the image quality of the target image is that of an image, among the plurality of original video frames, that includes the target feature.
Referring to FIG. 9, FIG. 9 is a schematic structural diagram of a camera provided in this application. The camera 900 of this embodiment of the application includes a processor 910 and a transceiver module 920, where:
the processor 910 is configured to generate a plurality of original video frames, generate a video using the plurality of original video frames, and obtain the position information of a target image in the video, where the target image is a video frame in the video or a part of a video frame in the video;
the transceiver module 920 is configured to send the video and the position information to the storage device.
In a possible embodiment,
when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position, where the first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
In a possible embodiment,
when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position, where the second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image in the corresponding video frame; the second relative position includes the offset of the target image relative to a specific video frame, and the position of the target image in the corresponding video frame.
In a possible embodiment, the processor 910 does not generate the target image, and the transceiver module 920 does not send the target image to the storage device.
In a possible embodiment, the camera 900 is further configured to:
select, from the video, the target frame in which the target image is located;
obtain the position information of the target image in the video according to the target frame and the video.
In a possible embodiment,
the target image is located in an I frame of a group of pictures (GOP) of the video.
In a possible embodiment, the camera 900 is further configured to obtain the target image from the plurality of original video frames, where the image quality of the target image is that of an image, among the plurality of original video frames, that includes the target feature.
As shown in FIG. 10, an embodiment of this application further provides a camera 1000. The camera 1000 includes a processor 1010, a memory 1020, and a transceiver 1030, where the memory 1020 stores instructions or programs and the processor 1010 is configured to execute the instructions or programs stored in the memory 1020. When the instructions or programs stored in the memory 1020 are executed, the processor 1010 performs the operations performed by the processor 910 in the foregoing embodiment, and the transceiver 1030 performs the operations performed by the transceiver module 920 in the foregoing embodiment.
Refer to FIG. 11, which is a schematic structural diagram of a media data transmission apparatus provided in this application. The media data transmission apparatus 1100 provided in this embodiment includes:
a receiving unit 1110, configured to receive the video and the position information of the target image sent by the camera, where the target image is a video frame in the video or a part of a video frame in the video; and
an obtaining unit 1120, configured to obtain the target image from the corresponding video frame of the video according to the position information.
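The obtaining step can be sketched as a lookup followed by an optional crop. This is a minimal illustration, assuming the video has already been decoded into frames held as rows of pixel values; the function name, the `Frame` and `Region` types, and the `(x, y, width, height)` layout are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Frame = List[List[int]]             # a decoded frame as rows of pixel values
Region = Tuple[int, int, int, int]  # hypothetical (x, y, width, height)


def acquire_target_image(
    frames: Sequence[Frame],
    frame_number: int,
    region: Optional[Region] = None,
) -> Frame:
    """Return the whole frame (large image) or a crop of it (small image)."""
    frame = frames[frame_number]
    if region is None:
        # Large image: the target image is the video frame itself.
        return [row[:] for row in frame]
    x, y, width, height = region
    if x < 0 or y < 0 or y + height > len(frame) or x + width > len(frame[0]):
        raise ValueError("region falls outside the video frame")
    # Small image: the target image is a part of the video frame.
    return [row[x:x + width] for row in frame[y:y + height]]
```

A real implementation would also have to decode the frame from the compressed video first, which this sketch leaves out.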
In a possible embodiment, when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position. The first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
In a possible embodiment, when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position. The second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image within that video frame. The second relative position includes the offset of the target image relative to a specific video frame, together with the position of the target image within its corresponding video frame.
In a possible embodiment, within the storage life cycle of the video, when a request to read the target image is received, the target image is obtained from the corresponding video frame in the video according to the position information.
In a possible embodiment, the apparatus is further configured to: when the storage life cycle of the video ends, obtain the target image from the corresponding video frame of the video according to the position information, save the target image, and delete the video.
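The two life-cycle behaviours described above (extract on request while the video exists, then save the image and delete the video at expiry) can be modelled with a toy in-memory store. This is a sketch only: the class, its method names, and the use of plain frame numbers as position information are assumptions, and a real storage device would hold encoded video on disk.

```python
from typing import Dict, List

Frame = List[List[int]]  # stand-in for a decoded video frame


class StorageDevice:
    """Toy in-memory model; every name here is hypothetical."""

    def __init__(self) -> None:
        self.videos: Dict[str, List[Frame]] = {}
        self.positions: Dict[str, List[int]] = {}
        self.saved_images: Dict[str, List[Frame]] = {}

    def store(self, video_id: str, frames: List[Frame], frame_numbers: List[int]) -> None:
        """Keep the video and its position information; no image is stored yet."""
        self.videos[video_id] = frames
        self.positions[video_id] = frame_numbers

    def read_target_images(self, video_id: str) -> List[Frame]:
        """Serve a read request, extracting from the video while it still exists."""
        if video_id in self.videos:
            frames = self.videos[video_id]
            return [frames[n] for n in self.positions[video_id]]
        return self.saved_images[video_id]

    def expire(self, video_id: str) -> None:
        """End of the storage life cycle: save the target images, delete the video."""
        self.saved_images[video_id] = self.read_target_images(video_id)
        self.videos.pop(video_id, None)
```

Until `expire` runs, the target image occupies no storage of its own; afterwards only the image remains.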
In a possible embodiment, the apparatus is further configured to:
receive the video stream and the position information of the target image sent by the camera, where the target image is a video frame in the video stream or a part of a video frame in the video stream; and
obtain the target image from the corresponding video frame of the video stream according to the position information.
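For the streaming variant, the target images can be picked out as frames arrive instead of after a whole video has been stored. The generator below is a minimal sketch under assumptions: frames arrive as `(frame_number, frame)` pairs, the position information is a set of frame numbers known before the matching frames arrive, and all names are hypothetical.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Frame = List[List[int]]  # stand-in for a decoded video frame


def extract_from_stream(
    stream: Iterable[Tuple[int, Frame]],
    target_frames: Set[int],
) -> Iterator[Tuple[int, Frame]]:
    """Yield each target image as soon as its frame arrives in the stream."""
    pending = set(target_frames)
    for frame_number, frame in stream:
        if frame_number in pending:
            pending.discard(frame_number)
            yield frame_number, frame
        if not pending:
            # Every requested target image has been found; stop reading.
            return
```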
Refer to FIG. 12, which is a schematic structural diagram of a storage device provided in this application. The storage device 1200 provided in this embodiment includes a transceiver module 1210 and a processor 1220:
the transceiver module 1210 is configured to receive the video and the position information of the target image sent by the camera, where the target image is a video frame in the video or a part of a video frame in the video; and
the processor 1220 is configured to obtain the target image from the corresponding video frame of the video according to the position information.
In a possible embodiment, when the category of the target image is a large image, the position information includes a first absolute position and/or a first relative position. The first absolute position includes one or more of the frame number and the timestamp of the target image in the video, and the first relative position includes the offset of the target image relative to a specific video frame.
In a possible embodiment, when the category of the target image is a small image, the position information includes a second absolute position and/or a second relative position. The second absolute position includes one or more of the frame number and the timestamp, in the video, of the video frame corresponding to the target image, and further includes the position of the target image within that video frame. The second relative position includes the offset of the target image relative to a specific video frame, together with the position of the target image within its corresponding video frame.
In a possible embodiment, within the storage life cycle of the video, when the storage device 1200 receives a request to read the target image, the storage device 1200 obtains the target image from the corresponding video frame in the video according to the position information.
In a possible embodiment, the storage device 1200 is further configured to: when the storage life cycle of the video ends, obtain the target image from the corresponding video frame of the video according to the position information, save the target image, and delete the video.
In a possible embodiment, the storage device 1200 is further configured to:
receive the video stream and the position information of the target image sent by the camera, where the target image is a video frame in the video stream or a part of a video frame in the video stream; and
obtain the target image from the corresponding video frame of the video stream according to the position information.
As shown in FIG. 13, an embodiment of this application further provides a server 1300. The server 1300 includes a processor 1310, a memory 1320, and a transceiver 1330. The memory 1320 stores instructions or a program, and the processor 1310 is configured to execute the instructions or program stored in the memory 1320. When the instructions or program stored in the memory 1320 are executed, the processor 1310 performs the operations performed by the processor 1220 in the foregoing embodiment, and the transceiver 1330 performs the operations performed by the transceiver module 1210 in the foregoing embodiment.
An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium may store a program, and when the program is executed, some or all of the steps of any media data transmission method described in the foregoing method embodiments are performed.
An embodiment of this application further provides a computer program product. When the computer program product is read and executed by a computer, some or all of the steps of any media data transmission method described in the foregoing method embodiments are performed.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a storage disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state disk (SSD)).
The embodiments of this application are described in detail above. Specific examples are used in this document to explain the principles and implementations of this application, and the descriptions of the foregoing embodiments are intended only to help understand the method of this application and its core idea. A person of ordinary skill in the art may, based on the idea of this application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as a limitation on this application.

Claims (25)

  1. A media data transmission method, characterized in that the method comprises:
    generating, by a camera, multiple original video frames;
    generating, by the camera, a video using the multiple original video frames;
    obtaining, by the camera, position information of a target image in the video, wherein the target image is a video frame in the video or a part of a video frame in the video; and
    sending, by the camera, the video and the position information to a storage device.
  2. The method according to claim 1, characterized in that,
    when the category of the target image is a large image, the position information comprises a first absolute position and/or a first relative position, wherein the first absolute position comprises one or more of a frame number and a timestamp of the target image in the video, and the first relative position comprises an offset of the target image relative to a specific video frame.
  3. The method according to claim 1, characterized in that,
    when the category of the target image is a small image, the position information comprises a second absolute position and/or a second relative position, wherein the second absolute position comprises one or more of a frame number and a timestamp, in the video, of the video frame corresponding to the target image, and further comprises the position of the target image in the corresponding video frame; and the second relative position comprises an offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
  4. The method according to any one of claims 1 to 3, characterized in that the camera does not generate the target image and does not send the target image to the storage device.
  5. The method according to any one of claims 1 to 4, characterized by further comprising:
    selecting, by the camera from the video, a target frame in which the target image is located; and
    obtaining, by the camera, the position information of the target image in the video according to the target frame and the video.
  6. The method according to any one of claims 1 to 5, characterized in that
    the target image is located in an I frame of a group of pictures (GOP) of the video.
  7. The method according to any one of claims 1 to 5, characterized by further comprising:
    obtaining, by the camera, the target image from the multiple original video frames, wherein the target image is an image, among the multiple original video frames, that includes a target feature.
  8. The method according to any one of claims 1 to 5, characterized by further comprising: receiving, by the storage device, the video and the position information of the target image sent by the camera; and
    obtaining, by the storage device, the target image from the corresponding video frame of the video according to the position information.
  9. The method according to any one of claims 1 to 5, characterized by further comprising:
    storing, by the storage device, the video and the position information of the target image sent by the camera;
    when a storage life cycle of the video ends, obtaining, by the storage device, the target image from the corresponding video frame of the video according to the position information;
    saving, by the storage device, the target image; and
    deleting, by the storage device, the video.
  10. The method according to any one of claims 1 to 3, characterized by further comprising:
    receiving, by the storage device, a video stream and the position information of the target image sent by the camera, wherein the target image is a video frame in the video stream or a part of a video frame in the video stream; and
    obtaining, by the storage device, the target image from the corresponding video frame of the video stream according to the position information.
  11. A media data transmission method, characterized in that the method comprises:
    receiving, by a storage device, a video and position information of a target image sent by a camera, wherein the target image is a video frame in the video or a part of a video frame in the video; and
    obtaining, by the storage device, the target image from the corresponding video frame of the video according to the position information.
  12. The method according to claim 11, characterized in that,
    when the category of the target image is a large image, the position information comprises a first absolute position and/or a first relative position, wherein the first absolute position comprises one or more of a frame number and a timestamp of the target image in the video, and the first relative position comprises an offset of the target image relative to a specific video frame.
  13. The method according to claim 11, characterized in that,
    when the category of the target image is a small image, the position information comprises a second absolute position and/or a second relative position, wherein the second absolute position comprises one or more of a frame number and a timestamp, in the video, of the video frame corresponding to the target image, and further comprises the position of the target image in the corresponding video frame; and the second relative position comprises an offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
  14. The method according to any one of claims 11 to 13, characterized in that, within a storage life cycle of the video, when the storage device receives a request to read the target image, the storage device obtains the target image from the corresponding video frame in the video according to the position information.
  15. The method according to any one of claims 11 to 14, characterized by further comprising: when the storage life cycle of the video ends, obtaining, by the storage device, the target image from the corresponding video frame of the video according to the position information; and saving, by the storage device, the target image and deleting the video.
  16. A media data transmission apparatus, characterized in that the apparatus comprises:
    a first generating unit, configured to generate multiple original video frames;
    a second generating unit, configured to generate a video using the multiple original video frames;
    an obtaining unit, configured to obtain position information of a target image in the video, wherein the target image is a video frame in the video or a part of a video frame in the video; and
    a sending unit, configured to send the video and the position information to a storage device.
  17. The apparatus according to claim 16, characterized in that,
    when the category of the target image is a large image, the position information comprises a first absolute position and/or a first relative position, wherein the first absolute position comprises one or more of a frame number and a timestamp of the target image in the video, and the first relative position comprises an offset of the target image relative to a specific video frame.
  18. The apparatus according to claim 16, characterized in that,
    when the category of the target image is a small image, the position information comprises a second absolute position and/or a second relative position, wherein the second absolute position comprises one or more of a frame number and a timestamp, in the video, of the video frame corresponding to the target image, and further comprises the position of the target image in the corresponding video frame; and the second relative position comprises an offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
  19. The apparatus according to any one of claims 16 to 18, characterized in that the media data transmission apparatus does not generate the target image and does not send the target image to the storage device.
  20. A camera, characterized in that the camera comprises:
    a processor, configured to generate multiple original video frames, generate a video using the multiple original video frames, and obtain position information of a target image in the video, wherein the target image is a video frame in the video or a part of a video frame in the video; and
    a transceiver module, configured to send the video and the position information to a storage device.
  21. The camera according to claim 20, characterized in that,
    when the category of the target image is a large image, the position information comprises a first absolute position and/or a first relative position, wherein the first absolute position comprises one or more of a frame number and a timestamp of the target image in the video, and the first relative position comprises an offset of the target image relative to a specific video frame.
  22. The camera according to claim 20, characterized in that,
    when the category of the target image is a small image, the position information comprises a second absolute position and/or a second relative position, wherein the second absolute position comprises one or more of a frame number and a timestamp, in the video, of the video frame corresponding to the target image, and further comprises the position of the target image in the corresponding video frame; and the second relative position comprises an offset of the target image relative to a specific video frame and the position of the target image in the corresponding video frame.
  23. The camera according to any one of claims 20 to 22, characterized in that the camera does not generate the target image and does not send the target image to the storage device.
  24. A storage device, characterized in that the device comprises:
    a transceiver module, configured to receive a video and position information of a target image sent by a camera, wherein the target image is a video frame in the video or a part of a video frame in the video; and
    a processor, configured to obtain the target image from the corresponding video frame of the video according to the position information.
  25. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 15.
PCT/CN2020/097302 2019-09-19 2020-06-20 Media data transmission method and related device WO2021051912A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910888290.0 2019-09-19
CN201910888290 2019-09-19
CN202010051951.7A CN111263097B (en) 2019-09-19 2020-01-16 Media data transmission method and related equipment
CN202010051951.7 2020-01-16

Publications (1)

Publication Number Publication Date
WO2021051912A1 true WO2021051912A1 (en) 2021-03-25

Family

ID=70949290

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/097302 WO2021051912A1 (en) 2019-09-19 2020-06-20 Media data transmission method and related device

Country Status (2)

Country Link
CN (1) CN111263097B (en)
WO (1) WO2021051912A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111263097B (en) * 2019-09-19 2024-01-02 华为技术有限公司 Media data transmission method and related equipment
CN111818300B (en) * 2020-06-16 2022-05-27 浙江大华技术股份有限公司 Data storage method, data query method, data storage device, data query device, computer equipment and storage medium
CN112541429B (en) * 2020-12-08 2024-05-31 浙江大华技术股份有限公司 Intelligent image capture method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102148983A (en) * 2010-02-08 2011-08-10 杨占昆 Method for solving over-high occupancy of high-resolution image resource
US20150181207A1 (en) * 2013-12-20 2015-06-25 Vmware, Inc. Measuring Remote Video Display with Embedded Pixels
CN107277081A (en) * 2016-04-06 2017-10-20 北京优朋普乐科技有限公司 Section method for down loading and device, the stream media system of stream medium data
CN107992366A (en) * 2017-12-26 2018-05-04 网易(杭州)网络有限公司 Method, system and the electronic equipment that multiple destination objects are detected and tracked
CN109756749A (en) * 2017-11-07 2019-05-14 阿里巴巴集团控股有限公司 Video data handling procedure, device, server and storage medium
CN111263097A (en) * 2019-09-19 2020-06-09 华为技术有限公司 Media data transmission method and related equipment

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004048512A (en) * 2002-07-12 2004-02-12 Renesas Technology Corp Moving picture encoding method and moving picture encoding circuit
JP2005275765A (en) * 2004-03-24 2005-10-06 Seiko Epson Corp Image processor, image processing method, image processing program and recording medium recording the program
JP2009246642A (en) * 2008-03-31 2009-10-22 Kddi Corp Video transmission device, video display and video transmission system
CN103051865B (en) * 2012-12-28 2016-03-30 华为技术有限公司 The method that picture controls and terminal, video conference device
CN103870574B (en) * 2014-03-18 2017-03-08 江苏物联网研究发展中心 Forming label based on the storage of H.264 ciphertext cloud video and indexing means
CN105681749A (en) * 2016-01-12 2016-06-15 上海小蚁科技有限公司 Method, device and system for previewing videos and computer readable media
CN106803936B (en) * 2017-02-24 2020-01-07 深圳英飞拓科技股份有限公司 Video snapshot method and device based on memory coding mechanism
CN109218656B (en) * 2017-06-30 2021-03-26 杭州海康威视数字技术股份有限公司 Image display method, device and system
KR20190090917A (en) * 2018-01-26 2019-08-05 주식회사 삼알글로벌 Video watch apparatus and video watch method
CN109040587A (en) * 2018-08-01 2018-12-18 北京旷视科技有限公司 It captures processing method, device, capture mechanism, equipment and storage medium
CN109358315B (en) * 2018-10-12 2020-08-18 华中科技大学 Auxiliary target indirect positioning method and system
CN109359596A (en) * 2018-10-18 2019-02-19 上海电科市政工程有限公司 A kind of highway vehicle localization method fast and accurately
CN109783680B (en) * 2019-01-16 2021-03-23 北京旷视科技有限公司 Image pushing method, image acquisition device and image processing system
CN110210385B (en) * 2019-05-31 2021-12-21 广东小天才科技有限公司 Article tracking method, apparatus, system and storage medium

Also Published As

Publication number Publication date
CN111263097B (en) 2024-01-02
CN111263097A (en) 2020-06-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20866284

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20866284

Country of ref document: EP

Kind code of ref document: A1