CN117710893B - Multidimensional digital image intelligent campus digitizing system - Google Patents
Multidimensional digital image intelligent campus digitizing system
- Publication number
- CN117710893B (application number CN202311787565.4A)
- Authority
- CN
- China
- Prior art keywords
- building
- video
- image
- frame
- image capturing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/05—Geographic models
Abstract
The application relates to the technical field of computers and provides a multi-dimensional digital image smart campus digitizing system, which comprises a digitizing control center, a video acquisition unit, a video identification unit, an image construction unit, a digitizing unit, a digital image management unit, a video encoding unit, a video buffering unit and a video playing unit. The digitizing control center is connected to and controls the video acquisition unit, the video identification unit, the image construction unit, the digitizing unit, the digital image management unit, the video encoding unit, the video buffering unit and the video playing unit. The multi-dimensional digital image smart campus digitizing system provided by the application improves the degree of intelligence and realizes digital management of the smart campus based on multi-dimensional digital images.
Description
Technical Field
The application relates to the technical field of computers, in particular to a multidimensional digital image intelligent campus digitizing system.
Background
In existing campus management, administrators typically gather information about the campus through patrols and analyze the data collected during those patrols in order to manage the students, teachers and buildings on campus; the degree of intelligence of this approach is low.
Disclosure of Invention
The application provides a multi-dimensional digital image smart campus digitizing system, which aims to improve the degree of intelligence and to realize digital management of the smart campus based on multi-dimensional digital images.
In a first aspect, the application provides a multi-dimensional digital image smart campus digitizing system, which comprises a digitizing control center, a video acquisition unit, a video identification unit, an image construction unit, a digitizing unit, a digital image management unit, a video encoding unit, a video buffering unit and a video playing unit; the digitizing control center is respectively connected with and controls the video acquisition unit, the video identification unit, the image construction unit, the digitizing unit, the digital image management unit, the video encoding unit, the video buffering unit and the video playing unit;
The video acquisition unit is used for: collecting video stream data to be processed in a campus area;
The video identification unit is used for: identifying the video stream data to be processed to obtain user behavior data and map data to be processed; the user behavior data comprise node coordinates of all limb nodes of the hand;
The image construction unit is used for: determining original user behavior information based on node coordinates of all limb nodes of the hand, and determining an original three-dimensional entity model of the building based on the map data to be processed;
The digitizing unit is used for: digitizing the original user behavior information and the original three-dimensional entity model to obtain digitized user behavior information and a digitized three-dimensional entity model;
The digital image management unit is used for: carrying out multidimensional digital image management based on the digitized user behavior information and the digitized three-dimensional entity model;
the video encoding unit is used for: encoding the video stream data to be processed;
The video buffering unit is used for: storing the video stream data to be processed;
The video playing unit is used for: playing the video stream data to be processed.
In a second aspect, the present application provides a multi-dimensional digital image smart campus digitizing method, applied to the system according to the first aspect, comprising:
Collecting video stream data to be processed in a campus area;
identifying the video stream data to be processed to obtain user behavior data and map data to be processed; the user behavior data comprise node coordinates of all limb nodes of the hand;
Determining original user behavior information based on node coordinates of all limb nodes of the hand, and determining an original three-dimensional entity model of the building based on the map data to be processed;
Digitizing the original user behavior information and the original three-dimensional entity model to obtain digitized user behavior information and a digitized three-dimensional entity model;
and carrying out multidimensional digital image management based on the digitized user behavior information and the digitized three-dimensional entity model.
In a third aspect, the present application also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the multi-dimensional digital image smart campus digitizing method of the second aspect when executing the program.
In a fourth aspect, the present application also provides a computer readable storage medium comprising a computer program which, when executed by a processor, implements the multi-dimensional digital image smart campus digitizing method of the second aspect.
In a fifth aspect, the application also provides a computer program product comprising a computer program which, when executed by a processor, implements the multi-dimensional digital image smart campus digitizing method of the second aspect.
The embodiments of the application improve the degree of intelligence and realize digital management of the smart campus based on multi-dimensional digital images.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings in the following description show some embodiments of the present application, and that other drawings can be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of a multi-dimensional digital image smart campus digitizing system according to the present application;
FIG. 2 is a flow chart of the method for digitizing multi-dimensional digital images in smart campus provided by the application;
Fig. 3 is a schematic structural diagram of an electronic device provided by the present application.
Description of the embodiments
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The embodiments of the present application provide a multi-dimensional digital image smart campus digitizing system. It should be noted that, although a logical order is shown in the flow chart, in some cases the steps shown or described may be performed in an order different from the one shown or described herein.
The following description is made with respect to terms related to embodiments of the present application:
Open Street Map (OpenStreetMap, OSM): an online collaborative mapping project that aims to create a world map whose content is free and which everyone can edit; OSM data can be freely downloaded and used. OSM data includes spatial data and attribute data.
The spatial data mainly include three types of primitives: points (Nodes), ways (Ways) and relations (Relations), which together make up the entire map. A node defines the location of a point in space; a way defines a line or an area; a relation (optional) defines the relationship between elements. A node defines a geographic coordinate point by longitude and latitude. The altitude of an object can be marked by height=; the map level where the object is located and the number of floors in a building can be marked by layer= and level=; the kind and name of an object are given by place= and name=. A way is formed by connecting a number of points (nodes) into a line or an area, and consists of 2 to 2000 points (nodes); a way exceeding 2000 nodes can be handled by segmentation. A way can represent three kinds of graphical objects: a) an open polyline (a non-closed line); b) a closed polyline (a line whose ends are connected, which can represent, for example, a real loop subway line); c) an area (a closed region, typically labeled using landuse=). A relation describes the interrelationship of two or more primitives (nodes, ways or other relations), the part each member plays being defined by a role attribute. Relations include: a) route: defining highways, bike lanes, railways, etc.; b) multipolygon: defining areas such as buildings, river banks, etc.; c) boundary: used for defining administrative boundaries; d) restriction: used for describing restrictions such as "no left turn".
The attribute data consist of tags. A Tag is not a basic map element; instead, every element records its data information through tags, each tag being a "key"="value" pair. For example, a residential road may be defined by highway=residential; additional information may be added using namespaces, for example maxspeed:winter= represents the maximum speed limit in winter.
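For illustration only, the following minimal Python sketch shows how the OSM primitives and Tag key/value pairs described above might be represented and queried; the class and field names are assumptions made for this example and are not taken from the OSM specification or from this system.

```python
# Minimal sketch: representing OSM primitives and reading Tag key/value pairs.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class OsmNode:
    node_id: int
    lat: float                                   # latitude
    lon: float                                   # longitude
    tags: Dict[str, str] = field(default_factory=dict)

@dataclass
class OsmWay:
    way_id: int
    node_ids: List[int]                          # 2..2000 node references
    tags: Dict[str, str] = field(default_factory=dict)

    def is_closed(self) -> bool:
        # A closed polyline / area starts and ends on the same node.
        return len(self.node_ids) > 2 and self.node_ids[0] == self.node_ids[-1]

# Example: a residential road tagged highway=residential with a winter speed limit.
road = OsmWay(way_id=1, node_ids=[10, 11, 12],
              tags={"highway": "residential", "maxspeed:winter": "30"})
print(road.tags.get("highway"), road.is_closed())
```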
Model UV (Model UV): a concept used in computer graphics. UV refers to texture coordinates, which are two-dimensional coordinates indicating, for each vertex, the position on the texture that corresponds to that point of the model surface. UV coordinates are typically written as (u, v), analogous to (x, y) in a planar Cartesian coordinate system. UV coordinates are used to attach a texture to a model so that the texture maps correctly onto every part of the model surface. During modeling, the vertices of the model are mapped to corresponding positions on a two-dimensional plane; these positions are the UV coordinates. By associating pixels of the texture image with the UV coordinates of the model, the texture can be attached to the model surface. UV coordinates precisely control how the texture is mapped onto the model surface, allowing fine texture drawing and rendering effects. By editing the UV coordinates of a model, the layout and distribution of the texture over the model surface can be adjusted. This is important in game development, animation and virtual-reality applications, because textures enhance the detail and realism of a model.
UV coordinates: short for the u, v texture map coordinates, which indicate the position of every point on the texture. All texture images are two-dimensional planes: the horizontal direction is U and the vertical direction is V, and any pixel on the image can be located through this planar two-dimensional UV coordinate system.
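As a small illustration of the mapping just described, the following Python sketch converts a (u, v) coordinate into a pixel location on a texture image; the top-left origin convention and nearest-pixel rounding are assumptions made only for this example.

```python
# Minimal sketch: locating the pixel referenced by a (u, v) coordinate on a 2-D texture.
def uv_to_pixel(u: float, v: float, width: int, height: int) -> tuple[int, int]:
    x = round(u * (width - 1))   # U runs along the horizontal direction
    y = round(v * (height - 1))  # V runs along the vertical direction
    return x, y

print(uv_to_pixel(0.5, 0.25, 1024, 1024))  # -> (512, 256)
```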
Referring to fig. 1, fig. 1 is a schematic diagram of the multi-dimensional digital image smart campus digitizing system according to the present application. The multi-dimensional digital image smart campus digitizing system provided by the embodiment of the application comprises a digitizing control center 10, a video acquisition unit 20, a video identification unit 30, an image construction unit 40, a digitizing unit 50, a digital image management unit 60, a video encoding unit 70, a video buffering unit 80 and a video playing unit 90. The digitizing control center 10 is respectively connected with and controls the video acquisition unit 20, the video identification unit 30, the image construction unit 40, the digitizing unit 50, the digital image management unit 60, the video encoding unit 70, the video buffering unit 80 and the video playing unit 90.
In an embodiment, the video capturing unit 20 may capture video stream data to be processed in the campus area.
In an embodiment, the video identification unit 30 may identify the video stream data to be processed to obtain user behavior data and map data to be processed, where the user behavior data include node coordinates of each limb node of the hand.
Optionally, the video stream data are first preprocessed, including video parsing, frame extraction and image processing, in preparation for subsequent analysis and recognition. Human postures, actions and gestures in the video stream can then be identified and tracked using computer vision and deep learning techniques. By detecting and analyzing information such as human key points and movement trajectories, user behaviors such as walking, running, raising a hand or making a phone call can be inferred, and the recognition results can be converted into corresponding user behavior data. Optionally, the map can also be identified and extracted using computer vision technology; relevant information of the map to be processed can be obtained using techniques such as object detection, image segmentation and feature extraction, so as to obtain the map data to be processed.
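A minimal sketch of this preprocessing and recognition pipeline is given below, assuming OpenCV for video parsing and frame extraction; the keypoint detector detect_hand_keypoints is a hypothetical placeholder for whatever pose or hand-tracking model is actually deployed.

```python
# Minimal sketch: parse the video stream, sample frames, and collect hand/limb keypoints.
import cv2

def detect_hand_keypoints(frame):
    """Placeholder: return a list of (x, y) node coordinates for hand limb nodes."""
    raise NotImplementedError("plug in a pose-estimation / hand-tracking model here")

def extract_behavior_data(video_path: str, frame_step: int = 5):
    cap = cv2.VideoCapture(video_path)
    behavior_data = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % frame_step == 0:                          # frame extraction: sample every N-th frame
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # basic image processing step
            behavior_data.append(detect_hand_keypoints(frame))
        index += 1
    cap.release()
    return behavior_data
```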
In one embodiment, the image construction unit 40 determines the original user behavior information based on the node coordinates of each limb node of the hand, and determines the original three-dimensional solid model of the building based on the map data to be processed.
Optionally, the raw user behavior information is determined based on node coordinates of each limb node of the hand, typically by hand tracking techniques. Hand tracking technology is a technology for tracking and recording hand motion trajectories, gestures, and postures. A depth camera, infrared sensor or other sensor is typically used to capture motion trajectory and position data of the hand and convert it into a form of node coordinates.
Optionally, determining the original three-dimensional solid model of the building includes:
Step 10331, obtaining the map data to be processed of the target area, and deleting redundant data in the map data to be processed to obtain the target map data.
Illustratively, the target area is the area selected by the user on the OSM map for which a three-dimensional solid model needs to be created. Because of the characteristics of the OSM map, redundant map data belonging to non-target areas are attached when the map data of the target area are acquired; in order to reduce the amount of calculation, these redundant map data need to be deleted, specifically:
Step 10331a1, obtaining the map data to be processed of the specified target area from the open street map OSM, and determining other map data outside the target area contained in the map data to be processed as redundant data.
And step 10331a2, deleting redundant data from the map data to be processed to obtain target map data.
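The following minimal sketch illustrates steps 10331a1 and 10331a2 under the assumption that the target area is described by a simple latitude/longitude bounding box and that each map element carries lat/lon fields; both assumptions are made for illustration only.

```python
# Minimal sketch: keep only map elements inside the user-selected target area and
# treat everything else as redundant data to be deleted.
def filter_target_map_data(elements, bbox):
    """elements: iterable of dicts with 'lat'/'lon'; bbox: (min_lat, min_lon, max_lat, max_lon)."""
    min_lat, min_lon, max_lat, max_lon = bbox
    target, redundant = [], []
    for e in elements:
        inside = min_lat <= e["lat"] <= max_lat and min_lon <= e["lon"] <= max_lon
        (target if inside else redundant).append(e)
    return target, redundant   # the redundant data are simply discarded downstream
```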
Illustratively, the embodiments of the present application support a large number of iterations. For example, multiple building variations can be created in a short time by analyzing and categorizing the OSM data types and attribute fields, building a data operation framework with nodes of the three-dimensional modeling application Houdini, and then reading the attribute fields to create complete roads and buildings; this is particularly efficient when the product to be displayed is a relatively complex 3D scene. In general, the more complete a tool's functionality, the more problems can be resolved automatically and the more time is saved.
It should be noted that, in order to facilitate explanation and explanation of each step in the embodiments, the embodiments of the present application are exemplified by Houdini three-dimensional modeling application. Those skilled in the art may implement the same technical effects with other three-dimensional modeling applications according to the method steps in the embodiments of the present application.
Step 10332, based on the attributes of each patch in the target map data, a plurality of building patches associated with buildings are screened out from the target map data.
After the target map data without redundant data is obtained, three-dimensional entity models of all buildings in the target area can be quickly built in batches based on the target map data.
Illustratively, the target map data includes a number of patches representing residences, businesses, parks, roads, rivers, etc., from which it is necessary to screen out patches related to a building and generate a three-dimensional solid model of the building based on such patches, since only the building needs to be modeled in batches quickly.
Step 1033, based on the attributes of each building patch and the area of the building patch, the corresponding building type of each building patch is determined, so as to obtain the building type of each building.
Wherein the building types include: residential and non-residential buildings.
Illustratively, before the three-dimensional solid models of the buildings are generated, the building patches screened out in the above steps are further classified into residential buildings and non-residential buildings.
It will be appreciated that since residential buildings are generally higher in height than other non-residential buildings, in order to achieve a rapid generation of a three-dimensional solid model of a building, and have a high degree of realism, residential and non-residential buildings can be distinguished by height.
Step 1033a1, deleting, from the plurality of building patches, the building patches whose area is smaller than or equal to a first preset threshold value.
Step 1033a2, determining the building type of each building according to the attributes of the plurality of building patches, and adjusting the building type of each building based on a preset adjustment rule.
Wherein the preset adjustment rule includes: adjusting the building type of a building patch whose area is larger than a second preset threshold value and whose building type is residential to non-residential; and adjusting the building type of a building patch whose area is smaller than or equal to the second preset threshold value and larger than the first preset threshold value to residential; the first preset threshold value is smaller than the second preset threshold value.
Illustratively, the building type of each building can be initially determined from the attributes of each building patch; however, these attributes are not very reliable and therefore need to be further adjusted according to the preset adjustment rule described above.
Illustratively, by setting the minimum area threshold, patches that are too small to correspond to a real building footprint are deleted, i.e., when the area of a building patch is very small, the building patch can be deleted directly. Since the area of a residential building is typically not very large (typically about the area of three unit buildings), when the area of a building patch is very large, it can be determined to be an industrial building (e.g., a factory building) or another non-residential building.
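The screening and adjustment of steps 1033a1 and 1033a2 might look like the following sketch; the concrete threshold values and the dictionary-based patch representation are illustrative assumptions, not values taken from the disclosure.

```python
# Minimal sketch: drop undersized building patches, then adjust the building type by area.
FIRST_THRESHOLD = 80.0     # m^2, minimum plausible building footprint (assumed value)
SECOND_THRESHOLD = 1500.0  # m^2, above this a "residential" patch is re-typed (assumed value)

def classify_building_patches(patches):
    """patches: list of dicts with 'area' (m^2) and an initial 'type' from the OSM attributes."""
    result = []
    for p in patches:
        if p["area"] <= FIRST_THRESHOLD:
            continue                                    # step 1033a1: delete undersized patches
        if p["area"] > SECOND_THRESHOLD and p["type"] == "residential":
            p["type"] = "non-residential"               # very large footprint: factory or similar
        elif FIRST_THRESHOLD < p["area"] <= SECOND_THRESHOLD:
            p["type"] = "residential"                   # mid-sized footprints are treated as residential
        result.append(p)
    return result
```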
Step 1034, determining building parameter information based on the building type of each of the plurality of building patches, and generating a to-be-processed three-dimensional solid model of each building based on the building parameter information.
Wherein the building parameter information includes: the floor height of residential buildings and the floor height of non-residential buildings. Different types of buildings have different heights.
For example, after the building type of each building patch is determined, the heights of the different types of buildings may be further determined, and the three-dimensional solid model of each building may be generated based on the height of the building.
Specifically, determining the building parameter information based on the building type of each of the plurality of building patches in step 1034 described above may include the following step 1034a:
Step 1034a, randomly selecting a first building floor height from a first random range as the floor height of the buildings corresponding to building patches whose building type is residential, and randomly selecting a second building floor height from a second random range as the floor height of the buildings corresponding to building patches whose building type is non-residential.
Wherein any value in the first random range is greater than any value in the second random range, so that the first building floor height is greater than the second building floor height.
The heights of the buildings whose building type is residential can be unified as the first building floor height, or each building whose building type is residential can be given a floor height randomly generated in the way the first building floor height is generated.
Likewise, the heights of the buildings whose building type is non-residential can be unified as the second building floor height, or each building whose building type is non-residential can be given a floor height randomly generated in the way the second building floor height is generated.
For example, after the floor height of each building is determined, the three-dimensional solid model of each building may be generated.
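A minimal sketch of step 1034a is shown below; the two numeric ranges are placeholders chosen only to satisfy the constraint that every value in the first range exceeds every value in the second range.

```python
# Minimal sketch: pick a random height per building, drawing residential buildings
# from a higher range than non-residential buildings.
import random

FIRST_RANGE = (30.0, 60.0)   # residential buildings (assumed range, metres)
SECOND_RANGE = (6.0, 20.0)   # non-residential buildings (assumed range, metres)

def assign_building_height(building_type: str) -> float:
    low, high = FIRST_RANGE if building_type == "residential" else SECOND_RANGE
    return random.uniform(low, high)

print(assign_building_height("residential"), assign_building_height("non-residential"))
```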
Specifically, after the step 1034a, the method includes:
Step 1034b1, generating coarse-grained three-dimensional solid models of the plurality of buildings on the basis of each of the plurality of building patches, based on the building parameter information; one building patch is used to generate the coarse-grained three-dimensional solid model of one building.
Illustratively, based on the floor height of non-residential buildings and the floor height of residential buildings indicated by the building parameter information, the coarse-grained three-dimensional solid model of the building corresponding to each building patch can be generated by extruding (stretching) each building patch upwards. The coarse-grained three-dimensional solid model can be regarded as a polygonal box.
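The extrusion described in step 1034b1 can be sketched as follows, assuming each building patch is given as a 2-D footprint polygon and the coarse model is represented as a simple vertex/face list; this mesh representation is an assumption made for illustration.

```python
# Minimal sketch: extrude a building footprint polygon upwards to obtain a
# coarse-grained "polygonal box" model.
def extrude_patch(footprint, height):
    """footprint: list of (x, y) vertices; returns (vertices, faces) of a prism."""
    n = len(footprint)
    bottom = [(x, y, 0.0) for x, y in footprint]
    top = [(x, y, height) for x, y in footprint]
    vertices = bottom + top
    faces = [list(range(n))[::-1],         # bottom face (may later be deleted, see below)
             list(range(n, 2 * n))]        # top face
    for i in range(n):                     # one quad per footprint edge (the facade)
        j = (i + 1) % n
        faces.append([i, j, n + j, n + i])
    return vertices, faces
```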
Step 1034b2, determining the appearance structure information of the building corresponding to each building patch according to the size and area of each building patch and the building parameter information.
Wherein the appearance structure information includes at least one of: the number of floors, the number of buildings, the facade appearance structure and the top-surface appearance structure.
Illustratively, in order to generate a realistic three-dimensional solid model of a building, it is necessary to generate the appearance structure information of each building, i.e., the number of floors, the number of buildings, the facade appearance structure, the top-surface appearance structure and so on, on the basis of the coarse-grained three-dimensional solid model.
Since the bottom surface of the three-dimensional solid model of the building cannot be observed, the bottom surface may be deleted or the appearance structure of the bottom surface may not be generated in order to reduce the amount of calculation and to increase the generation speed of the three-dimensional model.
Step 1034b3, based on the appearance structure information, adjusting the appearance of the coarse-grained three-dimensional solid model of the building corresponding to each building patch, so as to generate the to-be-processed three-dimensional solid model of each building.
Illustratively, after appearance structure information of each building is obtained, the coarse-grained three-dimensional solid model of each building may be processed based on the appearance structure information to generate a to-be-processed three-dimensional solid model of each building.
It will be appreciated that after the appearance structure of each building is determined, further determination of its texture and texture is required, making the three-dimensional solid model more realistic.
Step 1035, dividing the surface of the three-dimensional solid model to be processed of each building into a top surface and a vertical surface, and generating different materials and textures for the top surface and the vertical surface of the three-dimensional solid model to be processed respectively.
Illustratively, different materials and textures may be provided for the top surface and the facade of each building; these may be randomly selected from a material and texture library, or may be fixed. At the same time, materials and textures also need to be provided for the ground in the vicinity of the building.
And 1036, adjusting the model texture mapping coordinates of the to-be-processed three-dimensional entity model of each building based on the unified texture coordinate system, and generating an original three-dimensional entity model of each building.
In order to unify the model UV textures of a large scene in world-space coordinates, it is also necessary to adjust the model texture map coordinates of the to-be-processed three-dimensional solid model of each building based on a unified texture coordinate system and to generate the original three-dimensional solid model of each building.
Illustratively, to enable quick loading when previewing the model on the device, following step 1036 described above, the following step 1037 may be included:
Step 1037, generating texture maps used by each original three-dimensional solid model under different near-far lenses.
The resolution of the texture map used by the original three-dimensional solid model under the close-range lens is larger than that of the texture map used by the original three-dimensional solid model under the far-range lens.
Illustratively, the texture processing for each building covers the naming, format, size and mapping of the textures. First, texture naming: for models of larger simulation scenes, a large number of textures are used, so the textures need to be matched with their models and given a prefix. The size of a texture is determined according to the actual situation of the scene, but the pixel dimensions in the length and width directions must be even numbers, typically 256x256, 512x512, 1024x1024, 2048x2048, 4096x4096, 8192x8192, etc.
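The size rule above can be sketched as follows, assuming the resolutions listed in the text form the set of allowed texture sizes; treating them as the allowed set, and snapping to the nearest one, are illustrative choices only.

```python
# Minimal sketch: snap a requested texture resolution to the nearest allowed size.
ALLOWED_SIZES = [256, 512, 1024, 2048, 4096, 8192]

def snap_texture_size(requested: int) -> int:
    return min(ALLOWED_SIZES, key=lambda s: abs(s - requested))

print(snap_texture_size(900))   # -> 1024
```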
Illustratively, after the original three-dimensional solid models are generated, they can be displayed. According to the product display settings, model assets are prepared in Houdini, the asset contents to be displayed are sorted, grouped and combined, and the assets are imported into the Unreal Engine or the Unity engine as files in the hda format or the fbx format; the assets are then loaded into the scene for display, after which model forms can be added according to the interactive use logic.
The embodiments of the application have the following advantages. Mobile-end performance optimization: at present, city-level three-dimensional scene applications have relatively high configuration requirements on the machine, and there are considerable limitations, especially for city-level model display and business applications. This is mainly reflected in the total face count of the three-dimensional scene model; the programmed building generation flow can dynamically set the face-count requirements of different volume levels according to different environments. Model level of detail means that the distance between an object and the viewpoint determines how finely the object is displayed. During real-time display the amount of data is large and must be processed in real time; if the fineness of the model is too high, the data processing speed drops accordingly. Model layering: the three-dimensional model layer partitioning rules classify models according to attributes, namely according to model display complexity and building attributes, and assign them different model generation rules. Model blocking: the model is divided into different areas according to its spatial position, and the models in areas relatively far from the scene viewpoint are assigned a lower-resource generation scheme. This controls the total amount of scene data, improves the data processing speed and makes the demonstration smoother.
In one embodiment, the digitizing unit 50 may digitize the original user behavior information and the original three-dimensional solid model to obtain digitized user behavior information and a digitized three-dimensional solid model.
Alternatively, the raw user behavior information is converted into computer-processable digital form, such as node coordinates, motion trajectories, etc., that can be manually recorded or captured by sensors. Digitization of user behavior information can be achieved by converting the data into a digital format, such as a matrix, vector, or time series. The original three-dimensional solid model is converted into a digitized form. This may be achieved by using a three-dimensional scanner, photogrammetry techniques or computer aided design software. In the digitizing process, the geometric shape, surface texture, color and other information of the entity model can be obtained and presented in the form of three-dimensional point cloud, polygonal grid or voxel representation. The digitized user behavior information and the digitized three-dimensional entity model are integrated, and the user behavior data can be associated with a specific part or area of the entity model according to specific requirements, so that a corresponding relation is established. For example, user behavior data is matched to specific node coordinates of the model to represent user operations or interactions on the model.
In one embodiment, digital representation management unit 60 may perform multidimensional digital representation management based on the digitized user behavior information and the digitized three-dimensional solid model.
Optionally, the digitized user behavior information and the three-dimensional entity model are integrated into a unified data management platform. This may involve modeling data to ensure that different types of data can be efficiently stored, retrieved and analyzed. Labels and association information are added to the user behavior information and the three-dimensional solid model to relate them. For example, specific user behavior data may be associated with corresponding three-dimensional solid model locations or regions, or descriptive labels may be added to specific properties of the solid model. A system capable of supporting multidimensional query and retrieval is designed, so that a user can screen and search data according to different dimensions. The multi-dimensional digital image is presented to the user using data visualization techniques. This may include using a three-dimensional model to demonstrate the effect of user behavior on a solid model, or graphically, etc., to demonstrate multi-dimensional features of user behavior information.
In an embodiment, the video encoding unit 70 may encode the video stream data to be processed, specifically:
Step 201, obtaining a real foreground video and a clean background image of the video stream data to be processed, obtaining a video to be encoded by blending (overlaying) the real foreground video and the clean background image, and adding noise matched with the real foreground video to the background area of the video to be encoded.
Specifically, the real foreground video and the clean background image of the video stream data to be processed can be obtained; this corresponds to a production method in which content is automatically produced using artificial intelligence technology.
Here, the real foreground video mainly refers to video photographed in reality, and its contents include actual objects and scenes that are desired to be displayed. These objects and scenes may be people, objects, or a specific environment, etc., to which embodiments of the application are not particularly limited.
It should be noted that the embodiment of the present application may be applied to video stream data to be processed, may be applied to a video overlay scene, and may also be applied to a video composition scene, which is not particularly limited.
The clean background image refers to an image that only contains the background. The clean background image can be obtained directly, or a foreground object can be extracted from the real foreground video and the clean background image can then be recovered using a deep learning model, where the extraction of the foreground object from the real foreground video can be realized using an object detection algorithm, a segmentation algorithm and the like; the deep learning model may be a GAN (Generative Adversarial Network) or the like, which is not particularly limited in the embodiments of the present application.
After the real foreground video and the clean background image are obtained, the video to be encoded can be obtained by blending the real foreground video and the clean background image; the video to be encoded is the video that subsequently needs to be encoded.
After obtaining the video to be encoded, noise matched with the real foreground video can be added to the background area of the video to be encoded, and the noise position and the noise signal ratio r are recorded.
In addition, a watermark matched with the real foreground video can be added to the background area of the video to be encoded.
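A minimal sketch of step 201 is given below, assuming NumPy arrays for the frames, a boolean foreground mask, and zero-mean Gaussian noise; the noise model and the way the noise signal ratio r is computed here are illustrative assumptions.

```python
# Minimal sketch: blend the real foreground over the clean background, add noise only
# to the background region, and record the noise signal ratio r.
import numpy as np

def compose_frame(foreground, fg_mask, background, noise_sigma=2.0):
    """foreground/background: HxWx3 uint8 arrays; fg_mask: HxW bool array (True = foreground)."""
    frame = np.where(fg_mask[..., None], foreground, background).astype(np.float32)
    noise = np.random.normal(0.0, noise_sigma, frame.shape).astype(np.float32)
    noise[fg_mask] = 0.0                                    # noise is added to the background only
    noisy = np.clip(frame + noise, 0, 255).astype(np.uint8)
    bg = ~fg_mask
    r = float(np.abs(noise[bg]).mean() / (frame[bg].mean() + 1e-6))  # recorded noise signal ratio
    return noisy, fg_mask, r                                 # noise position = background mask
```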
Step 202, determining the current frame of the video to be encoded;
In step 203, in the case that the current frame belongs to the first image group in the video to be encoded, the clean background image is set as the first frame, and the clean background image is encoded to obtain the encoding information, and in the case that the current frame does not belong to the first image group in the video to be encoded, the encoding information of the first frame in the previous image group of the current frame is copied as the encoding information of the first frame in the current image group.
In case the current frame belongs to the first group of pictures (Group Of Pictures, GOP) in the video to be encoded, the clean background image is set as the first frame and identified as a virtual frame, which is used only for encoding and is not displayed at the decoding end.
Here, a group of pictures (GOP) typically contains multiple video encoded frames, the types of which are typically divided into I frames, P frames and B frames, where an I frame is also referred to as a key frame, and furthermore, the first frame of a GOP is typically an I frame, which can be used as a reference point for random access and can be considered a still image.
The clean background image may then be encoded to obtain the encoding information, where the clean background image may be encoded using, for example, H.264 encoding or H.265 encoding, which is not particularly limited in the embodiments of the present application.
H.264, also called AVC (Advanced Video Coding), is a widely used block-based hybrid video coding standard.
H.265, also called HEVC (High Efficiency Video Coding), is the successor of H.264 and achieves a higher compression ratio at comparable visual quality.
And copying the coding information of the first frame in the previous image group of the current frame as the coding information of the first frame in the current image group under the condition that the current frame does not belong to the first image group in the video to be coded.
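Steps 202 and 203 can be sketched as follows; encode_frame is a hypothetical placeholder for the actual encoder (for example an H.264/H.265 implementation), and the GOP and result representations are assumptions made for illustration.

```python
# Minimal sketch: encode the clean background image as a virtual first frame for the
# first GOP, and copy that encoding information for the first frame of every later GOP.
def encode_gop_first_frames(gops, clean_background, encode_frame):
    first_frame_info = None
    encoded = []
    for gop_index, gop in enumerate(gops):
        if gop_index == 0:
            # virtual frame: encoded for reference only, never displayed at the decoder
            first_frame_info = encode_frame(clean_background)
        # later GOPs reuse (copy) the first frame's encoding information
        encoded.append({"gop": gop_index, "first_frame": first_frame_info, "frames": gop})
    return encoded
```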
It can be understood that the virtual reference frame is set for the noise-added background area and the fast predictive coding is performed, and the coding processes such as transformation are further skipped according to the noise signal ratio of the added noise, so that the video coding speed is increased, the coding complexity of the video coding is reduced, and the video coding efficiency is improved.
Step 204, performing coding tree division on the subsequent frames of the first frame of all the image groups of the video to be coded to obtain coding trees.
In particular, the coding tree may be obtained by performing coding tree partitioning of subsequent frames of the first frame of all image groups of the video to be coded.
When the coding tree is divided into the following frames of the first frame of all image groups of the video to be coded, the situation that the coding tree only comprises the background area of the video to be coded and the situation that the coding tree comprises the foreground area of the video to be coded can be considered, so that the noise adding background video and the real foreground video are separately processed by utilizing the characteristics of static characteristics of the noise adding background video background, priori characteristics of artificial noise and the like, a large number of conventional but unnecessary intra-frame prediction, inter-frame prediction processing and transformation coding processing processes are avoided in the video coding process, and the complexity of video coding is reduced.
In step 205, an encoded frame of the video to be encoded is determined based on the encoding information of the first frame of each group of pictures and the encoding tree of each group of pictures.
Specifically, after the coding information of the first frame of each image group and the coding tree of each image group are obtained, the coding information of the first frame of each image group and the coding tree of each image group may be integrated to obtain the coding frame of the video to be coded.
In the embodiment of the application, when the current frame belongs to the first image group in the video to be encoded, the clean background image is set as the first frame, the clean background image is encoded to obtain the encoding information, when the current frame does not belong to the first image group in the video to be encoded, the encoding information of the first frame in the previous image group of the current frame is copied to be used as the encoding information of the first frame in the current image group, the encoding tree is divided into encoding trees by the encoding tree of the first frame of all the image groups of the video to be encoded, and the encoding frame of the video to be encoded is determined based on the encoding information of the first frame of each image group and the encoding tree of each image group. The process sets a virtual reference frame for the noise-added background area and carries out quick predictive coding, and further skips coding processes such as transformation according to the noise signal ratio of the added noise, so that the video coding speed is increased, the coding complexity of the video coding is reduced, and the video coding efficiency is improved.
Based on the above embodiment, step 205 includes:
Step 2051, in the case that the coding tree of each image group only includes a background area of the video to be coded, performing predictive coding on the coding tree of each image group to obtain a prediction residual and a noise signal ratio;
step 2052, coding the prediction residual error based on the noise signal ratio to obtain a target coding frame;
Step 2053, determining an encoded frame of the video to be encoded based on the encoding information of the first frame of each image group and the target encoded frame.
Specifically, the characteristics of static characteristics of the background video background of the noise, priori characteristics of artificial noise and the like can be utilized to separate the background video of the noise from the real foreground video, so that unnecessary intra-frame predictive coding and inter-frame predictive coding of the background area of the noise are greatly reduced.
Therefore, in the case where the coding tree of each image group contains only the background region of the video to be coded, the coding tree of each image group can be predictive-coded to obtain a prediction residual and a noise size, where the noise size under the video stream data to be processed is equivalent to the prediction residual, and the noise signal ratio can be determined based on the noise size.
Before the coding tree of each image group is predictively coded to obtain the prediction residual and the noise signal ratio, time-consuming operations such as CTU (Coding Tree Unit) depth partitioning, traversal of CU (Coding Unit) partitions and rate-distortion calculation can be skipped directly; the coding tree is designated as skip inter-prediction mode, and the reference CU is the CU corresponding to the virtual frame, so that fast predictive coding is performed.
Here, CTU depth partitioning, traversing CU partitions and rate-distortion calculation are key operations in video coding. First, the CTU size is fixed and specified by the encoder; typical sizes are 16 x 16, 32 x 32 and 64 x 64. In HEVC (High Efficiency Video Coding), the CTU adopts a quadtree coding structure and can be recursively and adaptively partitioned according to the complexity of the image region, thereby improving the rate-distortion performance of the coding.
For each CTU, the partition of the CU will be performed inside. Each CU has its own partition depth, with a maximum partition depth of 3. To select the best CU coding depth, all possible partitions from 64×64 to 8×8 need to be traversed, for a total of 85 CUs. In this process, a prediction mode selection is made for each possible partition and its rate distortion cost is calculated. Finally, the optimal CU dividing mode with the minimum rate distortion cost is selected by comparing the rate distortion costs of different dividing modes.
Rate-distortion cost calculation is a core concept in HEVC, which considers the balance between coding efficiency and image quality. Specifically, for an LCU (largest coding unit), which is 64×64 CU in size, it is necessary to traverse all possible 64×64 to 8×8 partitions, calculate the rate-distortion cost for each partition mode, and thus select the best partition mode.
The prediction residual of a video frame is the difference between the predicted value and the actual value, where the pixels of the current image are predicted from the pixels of neighboring encoded images by exploiting the temporal correlation of the video.
It should be noted that, the process skillfully utilizes the equivalent relationship between the artificial noise and the prediction residual in the scene, further skips the calculation of the prediction residual, skips the encoding processes such as subsequent transformation as far as possible, and accelerates the encoding of the background area in the video to be encoded.
After the noise signal ratio is obtained, the prediction residual may be encoded based on the noise signal ratio to obtain a target encoded frame.
Here, the prediction residual may be encoded using any one or more of transform coding, quantization operation, and entropy coding, which is not particularly limited in the embodiment of the present application.
Here, the noise signal ratio is an important indicator for measuring the quality of an image frame, and reflects the anti-interference capability of the image frame. The target coding frame is all the subsequent frames of the first frame in each image group in the video to be coded obtained by final coding.
After the target coding frame is obtained, the coding frame of the video to be coded can be determined based on the coding information of the first frame of each image group and the target coding frame, namely, the coding information of the first frame of each image group and the target coding frame can be integrated to obtain the coding frame of the video to be coded.
Based on the above embodiment, step 2052 includes:
Executing a cyclic process until a preset condition is met:
The cyclic process includes:
Under the condition that the noise signal ratio is larger than a first preset threshold value, performing first coding operation on the prediction residual error to obtain a first coding frame;
Under the condition that the noise signal ratio is larger than a second preset threshold value and smaller than the first preset threshold value, performing second coding operation on the prediction residual error to obtain a second coding frame; the preset condition is that all frames of the video to be encoded are successfully encoded;
A target encoded frame is determined based on the first encoded frame or the second encoded frame.
Specifically, a loop process may be performed until a preset condition is satisfied:
The cyclic process includes:
Under the condition that the noise signal ratio is larger than a first preset threshold value, performing first coding operation on the prediction residual error to obtain a first coding frame;
And under the condition that the noise signal ratio is larger than a second preset threshold value and smaller than the first preset threshold value, performing second coding operation on the prediction residual error to obtain a second coding frame, wherein the preset condition is that all frames of the video to be coded are successfully coded.
Here, the first preset threshold may be denoted by th, and the first preset threshold may be 50, 60, 40, or the like, and the second preset threshold may be 0, which is not particularly limited in the embodiment of the present application.
Here, the first encoding operation includes transform encoding, quantization operation, and entropy encoding that are sequentially performed, and the second encoding operation includes quantization operation and entropy encoding that are sequentially performed.
That is, when the noise is small, the gain brought by transform coding is not significant while its computational cost remains; therefore, when the noise signal ratio is greater than the second preset threshold and less than the first preset threshold, transform coding can be skipped, thereby further reducing the coding computation.
In addition, when the noise signal ratio is equal to a second preset threshold, i.e., when the noise signal ratio is equal to 0, the prediction residual may directly skip transform coding, quantization operation, and entropy coding, thereby further reducing coding calculation.
The specific process of the first coding operation is as follows:
First, the prediction residual is transform coded. This step mainly converts the data from the spatial domain to another domain, such as the frequency domain, in order to represent the information in the data more efficiently. Common transforms include the discrete cosine transform (DCT), the discrete sine transform (DST), and the like.
Next, quantization operation is performed on the transformed data. The purpose of quantization is to map a large number of possible values to a small number of representative values, thereby reducing the code stream size.
The final step is entropy encoding, in order to further compress the data and increase the compression ratio. The entropy coding is a coding method based on the occurrence probability of data symbols, and can be used for representing the symbols with lower occurrence probability by using shorter code words and representing the symbols with higher occurrence probability by using longer code words, so as to achieve the purpose of compression. Common entropy coding methods include Huffman coding, arithmetic coding, and the like.
After the first encoded frame or the second encoded frame is obtained, a target encoded frame may be determined based on the first encoded frame or the second encoded frame.
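The threshold-driven choice between the first and second encoding operations can be sketched as follows; the stage functions passed in are hypothetical placeholders for transform coding, quantization and entropy coding.

```python
# Minimal sketch: choose which coding stages to run for a background-only coding tree
# based on the noise signal ratio.
def encode_background_residual(residual, noise_signal_ratio, th_first, th_second,
                               transform, quantize, entropy_code):
    if noise_signal_ratio > th_first:
        # first encoding operation: transform coding + quantization + entropy coding
        return entropy_code(quantize(transform(residual)))
    if th_second < noise_signal_ratio < th_first:
        # second encoding operation: transform coding is skipped
        return entropy_code(quantize(residual))
    # noise signal ratio equal to the second threshold (i.e. 0): skip all three stages
    return residual
```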
Based on the above embodiment, step 205 further includes:
step 2054, in the case that the coding tree of each image group includes a foreground region of the video to be coded, performing intra-frame predictive coding for region definition on the coding tree of each image group, and/or inter-frame predictive coding for region definition, to obtain a region-defined coded frame;
step 2055 determines an encoded frame of the video to be encoded based on the encoding information of the first frame of each image group and the region-defined encoded frame.
Specifically, in the case where the coding tree of each image group includes a foreground region of a video to be coded, intra-prediction coding for region definition and/or inter-prediction coding for region definition may be performed on the coding tree of each image group, to obtain a region-defined coded frame.
Here, inter-prediction coding includes, for example, motion estimation and MV (Motion Vector) prediction coding.
The intra-frame prediction coding of the region limitation or the inter-frame prediction coding of the region limitation limits the region to the foreground region, thereby reducing the searching range, further reducing the complexity of video coding and improving the efficiency of video coding.
After the region-defined encoded frame is obtained, the encoded information of the first frame of each image group and the region-defined encoded frame may be integrated to obtain an encoded frame of the video to be encoded.
Based on the above embodiment, step 2054 includes:
And limiting the area of the reference coding unit of the coding tree of each image group to the foreground area of the current frame for intra-frame prediction coding, and/or limiting the area of the reference coding unit of the coding tree of each image group to the foreground area of the previous frame of the current frame for inter-frame prediction coding, so as to obtain the area limiting coding frame.
Specifically, the region of the reference coding unit of the coding tree of each image group may be limited to the foreground region of the current frame for intra-prediction coding, and/or the region of the reference coding unit of the coding tree of each image group may be limited to the foreground region of the previous frame of the current frame for inter-prediction coding, to obtain the region-limited coded frame.
That is, the region of the reference coding unit of the coding tree of each image group may be limited to the foreground region of the current frame for intra-frame predictive coding, so as to obtain the optimal intra-frame prediction mode and thereby the region-limited coded frame; or the region of the reference coding unit may be limited to the foreground region of the previous frame of the current frame for inter-frame predictive coding, to obtain the region-limited coded frame; or both may be performed, i.e. the region of the reference coding unit is limited to the foreground region of the current frame for intra-frame predictive coding and to the foreground region of the previous frame for inter-frame predictive coding, and the coded frame with the best effect is then selected from the two as the region-limited coded frame, which is not particularly limited in the embodiment of the present application.
The embodiment of the application performs region-limited predictive coding of the foreground region in the video to be coded: in intra prediction, the region of the reference coding unit is limited to the foreground video region of the current frame, and in inter prediction such as motion estimation and MV prediction, the search region of motion estimation, MV prediction and the like is likewise limited to the foreground video region of the previous frame, so that the coding process of the foreground region is accelerated.
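As a minimal, non-limiting sketch of this region limitation (the mask representation, block-matching cost and search radius below are assumptions for illustration, not the encoder claimed by the application), candidate reference blocks can be filtered by a foreground mask before the usual prediction search:

```python
import numpy as np

def region_limited_search(cur_block, ref_frame, fg_mask, center, radius=8):
    """Toy block search restricted to the foreground region of a reference frame.

    cur_block: HxW block of the current frame (foreground CU to be coded)
    ref_frame: reference frame (previous frame for inter, current frame for intra)
    fg_mask:   boolean array, True where the reference pixel belongs to the foreground
    center:    (y, x) of the co-located block in the reference frame
    radius:    search radius; only candidates fully inside the foreground are evaluated
    """
    h, w = cur_block.shape
    cy, cx = center
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = cy + dy, cx + dx
            if y < 0 or x < 0 or y + h > ref_frame.shape[0] or x + w > ref_frame.shape[1]:
                continue
            # region limitation: skip candidates that fall outside the foreground mask
            if not fg_mask[y:y + h, x:x + w].all():
                continue
            cost = np.abs(ref_frame[y:y + h, x:x + w].astype(int) - cur_block.astype(int)).sum()
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost
```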
The method comprises the following steps. First, a real foreground video and a clean background image of the video stream data to be processed are obtained, the video to be encoded is obtained by superimposing the real foreground video on the clean background image, and noise matched with the real foreground video is added to the background area of the video to be encoded.
Second, in the case that the current frame belongs to the first image group in the video to be encoded, the clean background image is set as the first frame and encoded to obtain its encoding information; in the case that the current frame does not belong to the first image group in the video to be encoded, the encoding information of the first frame of the previous image group of the current frame is copied as the encoding information of the first frame of the current image group.
Third, coding tree partitioning is performed on the frames following the first frame of each image group of the video to be encoded, to obtain the coding trees.
Fourth, in the case that the coding tree of an image group only includes the background region of the video to be coded, time-consuming operations such as CTU depth division, CU partition traversal and rate-distortion calculation are skipped directly, the mode is designated as the skip inter-prediction mode, and the reference CU is the CU at the corresponding position of the virtual frame, so as to perform fast predictive coding.
Fifthly, under the condition that the noise signal ratio is larger than a first preset threshold value, carrying out transformation coding, quantization operation and entropy coding on the prediction residual error in sequence to obtain a first coding frame;
Under the condition that the noise signal ratio is larger than a second preset threshold value and smaller than a first preset threshold value, sequentially carrying out quantization operation and entropy coding on the prediction residual error to obtain a second coding frame;
In case the noise signal ratio is equal to a second preset threshold, the prediction residual skips transform coding, quantization operation and entropy coding.
Sixth, in the case that the coding tree of each image group includes a foreground region of a video to be coded, limiting a region of a reference coding unit of the coding tree of each image group to a foreground region of a current frame for intra-prediction coding, and/or limiting a region of a reference coding unit of the coding tree of each image group to a foreground region of a previous frame of the current frame for inter-prediction coding, to obtain a region-limited coded frame.
And seventh, judging whether all frames are coded, and ending video coding of the video stream data to be processed under the condition that all frames are coded.
On the one hand, the embodiment of the application utilizes characteristics such as the static nature of the background in the noisy-background video and the prior characteristics of the artificial noise to separate the noisy background from the real foreground video, thereby greatly reducing unnecessary intra-frame and inter-frame predictive coding of the noisy background area. At the same time, the equivalence between the artificial noise and the prediction residual in this scenario is exploited to skip the calculation of the prediction residual and, as far as possible, the subsequent transform and other coding steps, thereby accelerating the coding of the background area. On the other hand, the foreground region is subjected to region-limited predictive coding: in intra prediction, the region of the reference CU is limited to the foreground video region of the current frame, and in inter prediction such as motion estimation and MV prediction, the search region is likewise limited to the foreground video region of the previous frame, thereby accelerating the encoding process of the foreground region.
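For illustration only, the residual handling of the fifth step above can be sketched as a branch on the noise-signal ratio; the toy transform, quantization and entropy functions and the threshold names below are stand-ins assumed for the example, not the encoder claimed by the application:

```python
import numpy as np

def toy_transform(residual):
    # stand-in for transform coding (a real encoder would use a DCT/DST)
    return np.fft.rfft(residual.astype(float).ravel())

def toy_quantize(coeffs, step=8.0):
    # stand-in for the quantization operation
    return np.round(np.asarray(coeffs) / step)

def toy_entropy_code(symbols):
    # stand-in for entropy coding: report how many nonzero symbols remain
    return np.count_nonzero(symbols)

def encode_residual(residual, noise_signal_ratio, first_threshold, second_threshold):
    """Illustrative residual-handling branch for one coding unit.

    Mirrors the fifth step: full transform/quantization/entropy path, a
    transform-skip path, or skipping the residual entirely when it is
    equivalent to the artificial noise (noise_signal_ratio == second_threshold).
    """
    if noise_signal_ratio > first_threshold:
        return toy_entropy_code(toy_quantize(toy_transform(residual)))
    if second_threshold < noise_signal_ratio < first_threshold:
        return toy_entropy_code(toy_quantize(residual))
    # skip transform coding, quantization and entropy coding altogether
    return None
```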
In an embodiment, the video buffering unit 80 may store the video stream data to be processed, specifically:
step 301, responding to a playback operation of the video stream data to be processed, and segmenting a period from a playing time point of the video stream data to be processed to an ending time point of the video stream data to be processed to obtain a first period and a second period.
The duration of the first period may be set according to actual needs, for example, the first period may be set to 20 seconds, may also be set to 25 seconds or 30 seconds, and the like, and may specifically be set according to actual needs.
When the video stream data to be processed is played back, the video server can be queried as to whether video in the corresponding time period is included, and the video server returns whether a video time period exists within that period. The video server may be a cloud storage server or a local video server, and may be specifically set according to actual needs.
In the case that the video time period exists in the time period, the user can execute playback operation on the video stream data to be processed, and in response to the playback operation on the video stream data to be processed, the play time point of the video stream data to be processed can be determined, and video frames after the play time point are cached so as to play back the video stream data to be processed based on the cached video frames.
When the video frames after the playing time point are cached, unlike the prior art, the complete video frames near the appointed time point are not cached according to the time sequence, but the time period from the playing time point of the video stream data to be processed to the ending time point of the video stream data to be processed is segmented, so that a first time period and a second time period are obtained, and different caching mechanisms are adopted to cache the video frames in the first time period and the second time period.
Step 302, sequentially buffering a plurality of first video frames in a first period according to a time sequence.
Wherein the plurality of first video frames may include a plurality of intra-frame coded frames (I frames) and a plurality of inter-frame predictive coded frames (P frames).
When different caching mechanisms are adopted to cache video frames in the first period and the second period, two threads can be started. The thread 1 is used for sequentially caching a plurality of first video frames in a first period according to time sequence; and the thread 2 is used for caching a plurality of second video frames in a second period based on a preset caching strategy.
When the video frames after the playing time point are cached, a plurality of first video frames in a first period can be cached in sequence preferentially according to the time sequence, so that the video frames at the playing time point can be prevented from being blocked, and the smooth playing of the video frames at the playing time point is ensured.
For example, when the plurality of first video frames in the first period are buffered in sequence according to the time sequence preferentially, at least two possible implementation manners may be included:
In one possible implementation manner, after the plurality of first video frames in the first period are sequentially cached according to the time sequence, the plurality of second video frames in the second period are cached based on a preset caching strategy, so that the plurality of first video frames in the first period are cached preferentially, the video frames at the playing time point can be effectively prevented from being blocked, and smooth playing of the video frames at the playing time point is ensured.
For example, if the duration of the first period is 20 seconds, the thread 1 may be started first, after a plurality of first video frames within 20 seconds are buffered in sequence according to the time sequence, the thread 2 may be started again, and based on a preset buffering policy, a plurality of second video frames within the second period may be buffered.
In another possible implementation manner, after the first video frames of the multiple frames in the partial time periods are sequentially cached according to the time sequence, the second video frames in the second time period are synchronously cached based on a preset caching strategy, so that the first video frames in the first time period are preferentially cached, the video frames at the playing time point can be effectively prevented from being blocked, and the smooth playing of the video frames at the playing time point is ensured; and the buffering of the second video frames in the second period is started in advance, so that convenience is brought to better buffering of the second video frames in the second period.
For example, if the duration of the first period is 20 seconds, the thread 1 may be started first, after a plurality of first video frames within 10 seconds are buffered in sequence according to the time sequence, the thread 2 is started synchronously, so that the thread 2 and the thread 1 are parallel, and based on a preset buffering policy, a plurality of second video frames within the second period are buffered.
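A minimal sketch of this dual-thread buffering, assuming a shared in-memory buffer and illustrative function names (the actual caching interfaces are not specified here), might look as follows:

```python
import threading

def buffer_playback(frames_first_period, frames_second_period, cache, prebuffer_count=None):
    """Sketch of the two buffering threads described above.

    cache: shared list standing in for the local buffer space.
    prebuffer_count: if set, thread 2 starts once this many first-period frames
                     are cached (second implementation manner); otherwise it
                     starts after the whole first period (first manner).
    """
    started_second = threading.Event()

    def hierarchical_order(frames):
        # placeholder for the preset buffering strategy (I frames before P frames,
        # I frames ordered by priority); returned unchanged in this sketch
        return frames

    def thread1_sequential():
        for i, frame in enumerate(frames_first_period, start=1):
            cache.append(frame)                      # buffer strictly in time order
            if prebuffer_count is not None and i == prebuffer_count:
                started_second.set()                 # let thread 2 run in parallel from here
        started_second.set()                         # first manner: start thread 2 afterwards

    def thread2_hierarchical():
        started_second.wait()
        for frame in hierarchical_order(frames_second_period):
            if frame not in cache:                   # the threads share the same buffer space
                cache.append(frame)

    t1 = threading.Thread(target=thread1_sequential)
    t2 = threading.Thread(target=thread2_hierarchical)
    t1.start(); t2.start()
    t1.join(); t2.join()
```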
Step 303, caching a plurality of second video frames in the second period based on a preset caching strategy; wherein the plurality of second video frames include a plurality of intra-frame coded frames and a plurality of inter-frame predictive coded frames, and the preset buffering strategy includes: the buffer priority of the plurality of intra-frame coded frames is higher than the buffer priority of the plurality of inter-frame predictive coded frames.
Wherein the buffer priority of the plurality of intra-frame coded frames being higher than that of the plurality of inter-frame predictive coded frames can be understood as: the plurality of intra-frame coded frames are buffered preferentially, and the plurality of inter-frame predictive coded frames are buffered after the plurality of intra-frame coded frames.
Considering that an intra-frame coded frame carries complete information of its own and can be decoded independently without referring to other image frames, in order to allow the user to drag the playing progress into the second period and still quickly play the video frames after the designated time point, the plurality of intra-frame coded frames in the second period can be cached preferentially, which improves the hit rate of the time point dragged by the user within the second period; the plurality of inter-frame predictive coded frames are then cached, which can improve the video viewing experience of the user to a certain extent.
For example, when the plurality of second video frames in the second period are buffered based on the preset buffering policy, whether the buffering of the first video frames in the third period is completed may be detected, and when the buffering of the first video frames in the third period is completed, the plurality of second video frames are buffered based on the preset buffering policy.
The starting time point of the third period is a playing time point, and the ending time point of the third period is earlier than the ending time point of the first period, or the ending time point of the third period is the ending time point of the first period.
It can be understood that, when the ending time point of the third period is the ending time point of the first period, the plurality of second video frames in the second period are buffered, based on the preset buffering policy, after all the first video frames in the first period have been sequentially cached in time order, which corresponds to the first possible implementation manner in step 302 above; when the ending time point of the third period is earlier than the ending time point of the first period, the plurality of second video frames in the second period are buffered, based on the preset buffering policy, in parallel with the remaining first video frames after part of the first video frames in the first period have been sequentially cached in time order, which corresponds to the second possible implementation manner in step 302, the third period being the partial period in step 302. The third period may specifically be set according to actual needs.
It can be seen that, when video frames are cached, the period from the playing time point of the video stream data to be processed to its ending time point is segmented, in response to the playback operation, to obtain a first period and a second period; the plurality of first video frames in the first period are cached sequentially in time order; and the plurality of second video frames in the second period are cached based on the preset caching strategy, wherein the plurality of second video frames include a plurality of intra-frame coded frames and a plurality of inter-frame predictive coded frames, and the preset buffering strategy is that the buffer priority of the intra-frame coded frames is higher than that of the inter-frame predictive coded frames. In this way, different buffering mechanisms are adopted for the video frames in the first period and the second period, so that the video frames at the playing time point can be prevented from being blocked and can be played smoothly; and by preferentially caching the plurality of intra-frame coded frames in the second period, even if the user drags the playing progress to the second period, the video frames after the designated time point can be played rapidly based on the preferentially cached intra-frame coded frames, which improves the video viewing experience of the user to a certain extent.
For example, the first period may update its period range once every 1 second. For the second period, since its period range is tied to the hierarchical cache policy, the update interval of the second period may be longer than that of the first period in order to avoid a large impact on that policy; for example, the update interval of the second period may be set to 10 seconds, that is, its period range is updated once every 10 seconds, so that frequent updates of the second period range, and the resulting impact on the hierarchical cache policy, are avoided. The specific values may be set according to actual needs.
It will be appreciated that the thread 1 for buffering a plurality of first video frames in a first period and the thread 2 for buffering a plurality of second video frames in a second period may multiplex the same buffer space, so that when the thread 1 and the thread 2 request video frames to be buffered, whether the corresponding video frames are buffered or not may be searched in the local buffer space first, and if the corresponding video frames are buffered in the local buffer space, the video frames need not be requested from the video server.
Step 304, determining a priority corresponding to each intra-frame of the plurality of intra-frames.
Illustratively, when determining the priority corresponding to each of the plurality of intra-coded frames, the priority corresponding to each of the plurality of intra-coded frames may be determined based on a two-interval algorithm; the priority corresponding to each intra-frame of the plurality of intra-frames may be determined based on other algorithms, for example, a three-partition algorithm, and may be specifically set according to actual needs.
In the following description, the determination of the priority corresponding to each intra-coded frame of the plurality of intra-coded frames based on the two-interval algorithm will be described as an example, but the embodiment of the present application is not limited thereto.
For example, if there are a total of 100 intra-coded frames in the second period, the priorities corresponding to the 100 intra-coded frames may be determined based on the two-interval algorithm; assuming the order from high priority to low priority is the 50th frame, the 25th frame, the 75th frame, the 13th frame, the 37th frame, the 63rd frame, the 88th frame, and so on, the 100 intra-coded frames may be buffered in turn based on their respective priorities.
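For illustration, the two-interval ordering can be sketched as a breadth-first traversal of interval midpoints; the exact indices depend on the rounding convention chosen for the midpoint, so they may differ slightly from the example above:

```python
from collections import deque

def i_frame_priority_order(n):
    """Order n I-frame indices (1-based) by a two-interval (dichotomy) priority.

    Midpoints of ever smaller intervals are emitted first, which yields an
    ordering close to 50, 25, 75, 13, 37, 63, 88, ... for n = 100.
    """
    order, seen = [], set()
    queue = deque([(1, n)])                  # closed intervals still to be split
    while queue:
        lo, hi = queue.popleft()
        mid = (lo + hi) // 2                 # midpoint of the current interval
        if mid not in seen:
            seen.add(mid)
            order.append(mid)
        if mid - 1 > lo:
            queue.append((lo, mid))
        if hi > mid + 1:
            queue.append((mid, hi))
    # append any indices never produced as a midpoint (e.g. boundary frames)
    order.extend(i for i in range(1, n + 1) if i not in seen)
    return order
```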
Step 305, sequentially buffering a plurality of intra-coded frames based on the priorities corresponding to the intra-coded frames.
For example, when the plurality of intra-coded frames are sequentially buffered based on the priority corresponding to each intra-coded frame, they may be buffered according to a hierarchical buffering policy. When sequentially caching the plurality of intra-coded frames based on the hierarchical caching strategy, the corresponding levels and the number of intra-coded frames to be cached in each level are first determined based on the total number of intra-coded frames; the intra-coded frames to be cached in each level are then determined from the plurality of intra-coded frames based on the priority corresponding to each intra-coded frame and the number of frames to be cached in each level; and the plurality of intra-coded frames are cached in turn, level by level, based on the intra-coded frames to be cached in each level.
Therefore, by preferentially caching the plurality of intra-frame encoded frames in the second period, even if the user drags the playing progress to the second period, the video frames in the period after the appointed time point can be played rapidly based on the plurality of intra-frame encoded frames in the second period which are preferentially cached, and the video viewing experience of the user can be improved to a certain extent.
Still assuming that there are a total of 100 intra-coded frames in the second period, and that the two-interval algorithm determines their priorities from high to low as the 50th frame, the 25th frame, the 75th frame, the 13th frame, the 37th frame, the 63rd frame, the 88th frame, and so on: it can be determined that the number of intra-coded frames to be cached in level 1 is 1, and the frame to be cached in level 1 is the 50th frame with the highest priority; the number of intra-coded frames to be cached in level 2 is 2, and the frames to be cached in level 2 are the 25th and 75th frames; the number of intra-coded frames to be cached in level 3 is 4, and the frames to be cached in level 3 are the 13th, 37th, 63rd and 88th frames; and so on. After the intra-coded frames to be cached in each level are determined, the 100 intra-coded frames can be cached in turn based on the intra-coded frames to be cached in each level.
Illustratively, when the 100 intra-coded frames are sequentially buffered based on the intra-coded frames to be buffered for each level, the 50th frame may be buffered for level 1, the 25th and 75th frames for level 2, the 13th, 37th, 63rd and 88th frames for level 3, and so on, until the 100 intra-coded frames are buffered.
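A small sketch of the level grouping follows, assuming (as the example above suggests) that level k holds up to 2^(k-1) frames taken in priority order; the doubling rule is an assumption inferred from that example:

```python
def group_into_levels(priority_order):
    """Group a priority-ordered list of I-frame indices into hierarchical levels.

    Level k holds up to 2**(k-1) frames (1, 2, 4, 8, ...), taken in priority
    order, matching the 50 / 25,75 / 13,37,63,88 example above.
    """
    levels, start, size = [], 0, 1
    while start < len(priority_order):
        levels.append(priority_order[start:start + size])
        start += size
        size *= 2
    return levels

# e.g. group_into_levels([50, 25, 75, 13, 37, 63, 88]) -> [[50], [25, 75], [13, 37, 63, 88]]
```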
It can be understood that by preferentially buffering the plurality of intra-frame encoded frames in the second period and sequentially buffering the plurality of intra-frame encoded frames according to the priority corresponding to each intra-frame encoded frame, the buffered plurality of intra-frame encoded frames can be distributed more uniformly in a limited buffering time, so that the hit rate of the user dragging time point in the second period is improved.
Illustratively, the case of "continuously buffering I frames" is compared with the case of "hierarchically buffering I frames according to the priority corresponding to each I frame", where I frames are typically spaced 2 seconds apart. Assuming the second period covers 40 seconds and the available buffering time allows 11 I frames to be buffered, the number of buffered I frames is the same in both schemes, specifically: in the case of "continuously buffering I frames", the I frames at 0, 2, 4, 6, …, 20 seconds are buffered, a total of 11 I frames; in the case of "hierarchically buffering I frames according to the priority corresponding to each I frame", the buffered I frames are spread across the whole 40 seconds (for example at roughly 0, 4, 8, …, 40 seconds), again a total of 11 I frames.
The number of the cached I frames is the same, but under the condition of layering and caching the I frames according to the priority corresponding to each I frame, the cached 11 frames are distributed more uniformly within 40 seconds of the caching time, and the caching time can be better covered. Therefore, when the user drags the progress to the second period, the video frames in the period after the designated time point can be quickly played by only using the locally cached I frames, so that the video viewing experience of the user is improved to a certain extent.
For example, when the I-frame buffering within the second period is 50% complete (denote this percentage as A, here A = 50%), the precision of the time axis is adjusted when the user drags the time progress into the range of the second period, and the time precision T is set to 2/A seconds. When A is 50%, the time precision T is 4 seconds, i.e. the minimum unit of the time display while the user drags the progress is 4 seconds, and any time point in the second period can then be played promptly based on the cached video frames. In addition, in order to improve the hit rate of the time point dragged by the user within the second period, it may also be considered that, with the time precision unchanged, effects on the mobile phone side such as increased damping and snapping (adsorption) can be applied when the user drags to a cached I-frame time point, so that the user directly plays the picture at that time point.
Step 306, in the case that the buffering of the plurality of intra-frame coded frames is completed, buffering the plurality of inter-frame predictive coded frames.
For example, when buffering the plurality of inter-frame predictive coded frames, a plurality of first inter-frame predictive coded frames among them may be buffered first, a first inter-frame predictive coded frame being the inter-frame predictive coded frame adjacent to (i.e. immediately following) each intra-frame coded frame; then a plurality of second inter-frame predictive coded frames adjacent to the plurality of first inter-frame predictive coded frames are determined from the plurality of inter-frame predictive coded frames; the plurality of second inter-frame predictive coded frames are taken as a new plurality of first inter-frame predictive coded frames, and the operation is repeated until the buffering of the plurality of inter-frame predictive coded frames is completed.
As is apparent from the foregoing description, when buffering the plurality of P frames in the second period, the buffering may continue according to a similar hierarchical strategy. Because P frames can only be played normally when they are buffered continuously within the interval of their I-frame group, when the plurality of P frames in the second period are buffered, the 1st P frame after every I frame may be buffered preferentially according to the hierarchical strategy; after the 1st P frame after every I frame has been cached, the 2nd P frame after every I frame continues to be cached according to the hierarchical strategy; after the 2nd P frame after every I frame has been cached, the 3rd P frame after every I frame continues to be cached, and so on, until the P frames after all I frames have been cached.
When caching the plurality of first inter-frame predictive coded frames among the plurality of inter-frame predictive coded frames, a caching order of the plurality of first inter-frame predictive coded frames is determined based on the priorities of the intra-frame coded frames to which they correspond; the priority of an intra-frame coded frame and the caching order of its corresponding first inter-frame predictive coded frame are positively correlated; the plurality of first inter-frame predictive coded frames are then buffered sequentially based on that caching order.
Taking the preferential buffering of the 1st P frame after every I frame as an example: based on the priorities corresponding to the I frames, the 1st P frame after the I frame with the highest priority may be cached first; next, the 1st P frame after the I frame with the second-highest priority is cached; then the 1st P frame after the I frame with the third-highest priority is cached, and so on, until the 1st P frames after all I frames have been cached.
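The P-frame ordering described above can be sketched as follows; the data layout (a mapping from each I frame to the P frames that follow it) is an assumption for illustration:

```python
def p_frame_buffer_order(i_frame_priority, p_frames_per_group):
    """Illustrative ordering for buffering the P frames of the second period.

    i_frame_priority:   I-frame indices, highest buffering priority first
    p_frames_per_group: dict mapping each I-frame index to the list of P frames
                        that follow it, in display order
    Returns P frames in buffering order: the 1st P frame after every I frame
    (visited in I-frame priority order), then the 2nd P frame after every
    I frame, and so on.
    """
    order = []
    max_len = max((len(v) for v in p_frames_per_group.values()), default=0)
    for k in range(max_len):                      # k-th P frame after each I frame
        for i_frame in i_frame_priority:          # visit I frames by priority
            group = p_frames_per_group.get(i_frame, [])
            if k < len(group):
                order.append(group[k])
    return order
```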
When caching the plurality of second video frames in the second period, the priority corresponding to each intra-frame coded frame is determined, the plurality of intra-frame coded frames are buffered sequentially based on those priorities, and the plurality of inter-frame predictive coded frames are buffered once the buffering of the intra-frame coded frames is completed. Therefore, by preferentially caching the plurality of intra-frame coded frames in the second period, even if the user drags the playing progress to the second period, the video frames after the designated time point can be played rapidly based on the preferentially cached intra-frame coded frames, which improves the video viewing experience of the user to a certain extent.
It can be understood that, when the network condition is good, layered acquisition of video frames around the playing time point can additionally be performed, so that the corresponding playback video frames can be played rapidly even when the user drags the playing progress backwards. For example, the video frames in both the forward and backward directions may be acquired in a layered manner in the order of: I frames at future times, I frames at past times, P frames at future times, and P frames at past times. This may be set according to actual needs, and is not repeated in the embodiment of the present application.
Based on any one of the above embodiments, in the process of video frame buffering according to the embodiment of the present application, if a progress touch operation by the user on the video stream data to be processed is detected, the progress touch operation can be responded to; in the case that the playing time point after the progress touch operation falls within the second period, the video frames after that playing time point are played based on the video frames already buffered for the second period. Because the plurality of intra-frame coded frames in the second period are buffered preferentially, even if the user drags the playing progress to the second period, the video frames after the designated time point can be played rapidly based on those preferentially buffered intra-frame coded frames, which improves the video viewing experience of the user to a certain extent.
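As an illustrative sketch of responding to a drag into the second period, a dragged time point can be resolved to the nearest cached I frame so that playback starts from an independently decodable frame; the snapping rule and the function names are assumptions, not the claimed behaviour:

```python
import bisect

def resolve_drag_target(drag_time, second_period_start, cached_i_frame_times):
    """Resolve a dragged time point to a playable position (illustrative only).

    drag_time:            time point (seconds) the user dragged to
    second_period_start:  start of the second period (seconds)
    cached_i_frame_times: sorted list of time points of I frames already cached
    Returns the nearest cached I-frame time at or before drag_time when the drag
    lands in the second period, so playback can start immediately.
    """
    if drag_time < second_period_start or not cached_i_frame_times:
        return drag_time                      # first period: frames are cached sequentially
    pos = bisect.bisect_right(cached_i_frame_times, drag_time)
    if pos == 0:
        return cached_i_frame_times[0]
    return cached_i_frame_times[pos - 1]
```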
In an embodiment, the video playing unit 90 may play the video stream data to be processed, specifically:
Step 401, determining the size of a video playing window, and respective resolutions of at least two image capturing apparatuses playing videos through the video playing window.
The video playing window is used for playing videos shot by at least two camera devices, and the size of the video playing window can comprise the horizontal size and the vertical size of the video playing window.
In some embodiments, if the video playing window is a playing window of video playing software, the size of the video playing window may be determined based on the display resolution of the corresponding video playing device and the running mode of the video playing software. For example, the running modes of the video playing software include a full-screen mode and a restore-down mode. In the full-screen mode, the video playing window fills the display interface of the video playing device, so the size of the video playing window can be obtained from the display resolution of the video playing device; in the restore-down mode, the video playing window does not fill the display interface, so its size needs to be determined by combining the display resolution of the video playing device and the restore-down mode.
The image capturing apparatus is used for capturing a corresponding video, and the image capturing apparatus may include, for example, a fisheye camera, a dome camera, a four-eye camera, a barrel camera, and the like, which is not limited in the embodiment of the present application. Wherein each image capturing apparatus has a corresponding resolution, the resolution of the image capturing apparatus may include a horizontal resolution for indicating the number of pixels included in the horizontal direction by the corresponding image capturing apparatus and a vertical resolution for indicating the number of pixels included in the vertical direction by the corresponding image capturing apparatus.
Step 402, determining respective pane areas of at least two image capturing apparatuses from the video playback window based on respective resolutions of the at least two image capturing apparatuses and sizes of the video playback window.
After the resolution of each image capturing apparatus is obtained, a pane area of each image capturing apparatus can be determined in the video playback window based on the resolution of each image capturing apparatus and the size of the video playback window. For each image capturing apparatus, a pane area of the image capturing apparatus is an area in which the image capturing apparatus plays a captured video in a video play window. Since the pane area of the image capturing apparatus is determined based on the resolution of the image capturing apparatus, the size of the pane area of the image capturing apparatus is highly adapted to the resolution of the image capturing apparatus, and the utilization rate of the video playback window can be improved without excessive distortion of the screen.
In each pane area, a video corresponding to the image capturing apparatus is played in step 403.
After the pane areas of the respective image pickup apparatuses are determined, the video of the corresponding image pickup apparatus can be played in the respective pane areas.
The method comprises the steps of firstly determining the size of a video playing window, playing the respective resolutions of at least two camera devices of a video through the video playing window, then determining the respective pane areas of the at least two camera devices from the video playing window based on the respective resolutions of the at least two camera devices and the size of the video playing window, and playing the video of the corresponding camera device in each pane area.
Step 404, for each image capturing apparatus, determines the type of image size captured by the image capturing apparatus based on the resolution of the image capturing apparatus.
After the resolution of each image capturing apparatus is obtained, the image size type captured by each image capturing apparatus may be determined from its horizontal resolution and vertical resolution, wherein the image size type includes a first image size type and a second image size type. The first image size type means that the horizontal resolution of the image capturing apparatus is less than or equal to its vertical resolution (hereinafter simply referred to as type A), and the second image size type means that the horizontal resolution is greater than the vertical resolution (hereinafter simply referred to as type B); alternatively, the first image size type may be type B and the second image size type may be type A.
For any image capturing apparatus, after the resolution of the image capturing apparatus is obtained, the type of image size captured by the image capturing apparatus can be determined by comparing the horizontal resolution and the vertical resolution of the image capturing apparatus.
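A trivial sketch of this classification follows (which of the two types is treated as the first image size type is left configurable, as noted above):

```python
def image_size_type(horizontal_resolution, vertical_resolution):
    """Classify an image capturing apparatus by captured image size (illustrative).

    Type A: horizontal resolution <= vertical resolution
    Type B: horizontal resolution >  vertical resolution
    """
    return "A" if horizontal_resolution <= vertical_resolution else "B"

# e.g. image_size_type(1080, 1920) -> "A", image_size_type(1920, 1080) -> "B"
```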
Step 405, determining respective window layout areas of at least two image capturing apparatuses in the layout areas corresponding to the video playing windows based on the respective resolutions and image size types of the at least two image capturing apparatuses.
At least two image capturing apparatuses of the first image size type are included in the at least two image capturing apparatuses. Only the image capturing apparatus of the first image size type may be included in the at least two image capturing apparatuses, or the image capturing apparatus of the first image size type and the image capturing apparatus of the second image size type may be included in the at least two image capturing apparatuses.
How to determine the window layout area of the image capturing apparatus of the first image size type in the case where the image capturing apparatus of the first image size type is included in at least two image capturing apparatuses will be described below with reference to the drawings.
After the resolutions of the image capturing apparatuses of the at least two first image size types are obtained, window layout areas of the respective image capturing apparatuses of the at least two first image size types are determined in the layout areas, starting from a first position of the layout area corresponding to the video playback window, in order of the vertical resolution of the image capturing apparatuses of the at least two first image size types from high to low.
Step 406, determining a window layout area of a first image capturing apparatus of a maximum vertical resolution among the image capturing apparatuses of the at least two first image size types, the window layout area of the first image capturing apparatus being located at a first position.
The image pickup apparatus of the first image size type may be either an a-type image pickup apparatus or a B-type image pickup apparatus. In the following embodiments, for convenience of description, an image capturing apparatus of the first image size type is described as an example of an image capturing apparatus of the a type, and those skilled in the art will recognize that in the case where the image capturing apparatus of the first image size type is an image capturing apparatus of the B type, the manner of determining the corresponding window layout area is similar.
First, a window layout area of a first image pickup apparatus, which is an image pickup apparatus of the maximum vertical resolution among image pickup apparatuses of at least two first image size types, is determined. The window layout area of the first image capturing apparatus is located at a first position in the layout area corresponding to the video playing window, and the first position may be, for example, the leftmost side of the layout area, the rightmost side of the layout area, the upper left corner of the layout area, the upper right corner of the layout area, the lower left corner of the layout area, the lower right corner of the layout area, or the like, which is not limited in the embodiment of the present application.
The number of the first image capturing apparatuses may be one or more, and if the number of the first image capturing apparatuses is one, the window layout area of the one first image capturing apparatus is set at the first position. If the number of the first image capturing apparatuses is plural, window layout areas of the plural first image capturing apparatuses may be determined, respectively. For example, the window layout area of any one of the plurality of first image capturing apparatuses may be set at the first position, and the other first image capturing apparatuses are sequentially adjacent apparatuses.
The dashed line illustrates a layout area corresponding to the video playing window, where the layout area includes a first layout area and a second layout area, where the first layout area is a layout area corresponding to the first image size type, and the second layout area is a layout area corresponding to the second image size type. The image capturing apparatuses having the maximum vertical resolution determined in the image capturing apparatuses of the type a, including the image capturing apparatus 1, the image capturing apparatus 2, and the image capturing apparatus 3, respectively, may first determine window layout areas of the image capturing apparatus 1, the image capturing apparatus 2, and the image capturing apparatus 3.
Taking the first position as the leftmost side of the layout area as an example, since the vertical resolutions of the image capturing apparatus 1, the image capturing apparatus 2, and the image capturing apparatus 3 are equal, the window layout area of any one of the three image capturing apparatuses may be set at the leftmost side of the layout area, and then the window layout areas of the remaining two image capturing apparatuses may be set adjacently in order.
In one possible implementation, in the case where the vertical resolutions of the image capturing apparatus 1, the image capturing apparatus 2 and the image capturing apparatus 3 are equal, their horizontal resolutions may be compared, and the window layout areas of the three apparatuses are then determined based on the horizontal resolutions. If the horizontal resolutions in descending order correspond to the image capturing apparatus 1, the image capturing apparatus 2 and the image capturing apparatus 3, the window layout area of the image capturing apparatus 1 may be set at the leftmost side of the layout area, the window layout area of the image capturing apparatus 2 is then set adjacent to that of the image capturing apparatus 1, and the window layout area of the image capturing apparatus 3 is set adjacent to that of the image capturing apparatus 2.
In step 407, in the case where there are first other image capturing apparatuses other than the first image capturing apparatus among the at least two image capturing apparatuses of the first image size type, a second image capturing apparatus of the second largest vertical resolution is determined from the first other image capturing apparatuses.
After determining the window layout area of the first image capturing apparatus, it is determined whether or not there are first other image capturing apparatuses other than the first image capturing apparatus among the at least two image capturing apparatuses of the first image size type. If the first other image capturing apparatus does not exist, all window layout areas of the image capturing apparatuses of the first image size type are determined; if there is a first other image capturing apparatus, it is necessary to further determine the window layout area of the first other image capturing apparatus.
Specifically, in the case where it is determined that there is a first other image capturing apparatus, first, a second image capturing apparatus of a second largest vertical resolution is determined from the first other image capturing apparatuses, wherein the second image capturing apparatus has a vertical resolution smaller than that of the first image capturing apparatus, but is the image capturing apparatus of which the vertical resolution is the largest among the first other image capturing apparatuses.
Step 408, determining a window layout area of the second image capturing apparatus based on the resolution of the second image capturing apparatus, the window layout area of the second image capturing apparatus being adjacent to the window layout area of the first image capturing apparatus.
After the second image capturing apparatus is determined, a window layout area of the second image capturing apparatus may be obtained based on the resolution of the second image capturing apparatus, the window layout area of the second image capturing apparatus being adjacent to the window layout area of the first image capturing apparatus, and a top of the window layout area of the second image capturing apparatus being flush with a top of the window layout area of the first image capturing apparatus.
After the window layout areas of the image pickup apparatus 1, the image pickup apparatus 2, and the image pickup apparatus 3 are determined in the layout areas, the image pickup apparatus having the largest vertical resolution, that is, the image pickup apparatus 4, is determined in the remaining image pickup apparatuses of type a, and then the window layout area of the image pickup apparatus 4 is determined based on the resolution of the image pickup apparatus 4, the window layout area of the image pickup apparatus 4 being adjacent to the window layout area of the image pickup apparatus 3.
In step 409, in the case where there is a second other image capturing apparatus other than the second image capturing apparatus in the first other image capturing apparatus, a window layout area of the second other image capturing apparatus is determined based on the resolution of the first image capturing apparatus, the resolution of the second image capturing apparatus, and the resolution of the second other image capturing apparatus.
After determining the window layout area of the second image capturing apparatus, it is necessary to further determine whether or not there is a second other image capturing apparatus other than the second image capturing apparatus in the first other image capturing apparatus, and if not, the window layout areas of the image capturing apparatuses of the first image size type have all been determined, and if so, it is necessary to further determine the window layout areas of the second other image capturing apparatuses.
In the case where there is a second other image pickup apparatus in the first other image pickup apparatus, it is first determined whether there is a third image pickup apparatus in the second other image pickup apparatus, wherein a vertical resolution of the third image pickup apparatus is smaller than or equal to a difference between the vertical resolution of the first image pickup apparatus and the vertical resolution of the second image pickup apparatus, and a horizontal resolution of the third image pickup apparatus is smaller than or equal to a horizontal resolution of the second image pickup apparatus.
In the case where the third image capturing apparatus is present, a window layout area of the third image capturing apparatus may be determined based on the resolution of the third image capturing apparatus, the window layout area of the third image capturing apparatus being adjacent to both the window layout area of the first image capturing apparatus and the window layout area of the second image capturing apparatus.
After the window layout area of the image pickup apparatus 4 is determined, in the remaining image pickup apparatuses of the a type, it is determined whether or not there is a third image pickup apparatus satisfying that the vertical resolution is smaller than the difference between the vertical resolution of the image pickup apparatus 1 and the vertical resolution of the image pickup apparatus 4, and the horizontal resolution is smaller than or equal to the horizontal resolution of the image pickup apparatus 4, and if there is a third image pickup apparatus, a window layout area of the corresponding third image pickup apparatus may be set below the window layout area of the image pickup apparatus 4.
The image pickup apparatus 5 satisfies the above condition, the vertical resolution of the image pickup apparatus 5 is equal to the difference between the vertical resolution of the image pickup apparatus 1 and the vertical resolution of the image pickup apparatus 4, and the horizontal resolution of the image pickup apparatus 5 is smaller than the horizontal resolution of the image pickup apparatus 4. Accordingly, the window layout area of the image pickup apparatus 5 can be determined based on the resolution of the image pickup apparatus 5, the window layout area of the image pickup apparatus 5 being located below the window layout area of the image pickup apparatus 4 and being adjacent to the window layout areas of the image pickup apparatus 3, the image pickup apparatus 4, respectively.
After determining the window layout area of the third image capturing apparatus, it is determined whether or not a third other image capturing apparatus other than the third image capturing apparatus exists in the second other image capturing apparatus.
In the case where a third other image capturing apparatus other than the third image capturing apparatus exists in the second other image capturing apparatus, determining from the third other image capturing apparatus whether or not there is a fourth image capturing apparatus, a vertical resolution of the fourth image capturing apparatus being smaller than or equal to a difference between a vertical resolution of the first image capturing apparatus and a vertical resolution of the second image capturing apparatus, a horizontal resolution of the fourth image capturing apparatus being smaller than or equal to a difference between a horizontal resolution of the second image capturing apparatus and a horizontal resolution of the third image capturing apparatus.
After the window layout area of the image pickup apparatus 5 is determined, only a first other area than the window layout area of the image pickup apparatus 5 exists below the window layout area of the image pickup apparatus 4, and the first other area can only satisfy the image pickup apparatus layout in which the vertical resolution is smaller than or equal to the difference between the vertical resolution of the image pickup apparatus 1 and the vertical resolution of the image pickup apparatus 4, and the horizontal resolution is smaller than or equal to the difference between the horizontal resolution of the image pickup apparatus 4 and the horizontal resolution of the image pickup apparatus 5.
Determining a window layout area of the fourth image capturing apparatus based on the resolution of the fourth image capturing apparatus in the case where the fourth image capturing apparatus is present, until no image capturing apparatus whose resolution satisfies the remaining area size is present in the third other image capturing apparatus, the window layout area of the fourth image capturing apparatus being adjacent to the window layout area of the second image capturing apparatus and the window layout area of the third image capturing apparatus; wherein the remaining area is an area other than the window layout area of the second image capturing apparatus, the window layout area of the third image capturing apparatus, and the window layout area of the fourth image capturing apparatus, out of the areas with the maximum vertical resolution starting from the top of the window layout area of the second image capturing apparatus.
Among the remaining type a image pickup apparatuses, the image pickup apparatus 6 satisfies that the vertical resolution is less than or equal to the difference between the vertical resolution of the image pickup apparatus 1 and the vertical resolution of the image pickup apparatus 4, and the horizontal resolution is less than or equal to the difference between the horizontal resolution of the image pickup apparatus 4 and the horizontal resolution of the image pickup apparatus 5, so that the window layout area of the image pickup apparatus 6 is located below the image pickup apparatus 4, and the window layout area of the image pickup apparatus 6 is adjacent to both the window layout area of the image pickup apparatus 4 and the window layout area of the image pickup apparatus 5.
Then, it is further determined whether or not there is an image pickup apparatus satisfying the remaining area size among the remaining a-type image pickup apparatuses. The remaining area is an area other than the image pickup apparatus 4, the image pickup apparatus 5, and the image pickup apparatus 6, out of areas starting from the top of the image pickup apparatus 4 and having a height of the vertical resolution of the image pickup apparatus 1.
If there is an image pickup apparatus satisfying the remaining area size, a window layout area of the corresponding image pickup apparatus may be determined in the remaining area until there is no image pickup apparatus whose resolution satisfies the remaining area size.
Further, since there is no image pickup apparatus satisfying the remaining area size, the image pickup apparatus having the largest vertical resolution among the remaining image pickup apparatuses of the type a may be determined later as a new second image pickup apparatus, and the above-described operations are repeatedly performed until the window layout areas of all the image pickup apparatuses of the type a are determined. The image pickup apparatus 7 may be regarded as a new second image pickup apparatus, with its window layout area adjacent to that of the image pickup apparatus 4.
In the above-described embodiments, an implementation of how to determine the window layout area of the third image capturing apparatus in the case where the third image capturing apparatus exists is described. An implementation in the absence of the third image capturing apparatus is described below.
In the case where the third image capturing apparatus is not present, the fifth image capturing apparatus may be determined from the second other image capturing apparatuses. Optionally, the fifth image capturing apparatus is an image capturing apparatus having the largest vertical resolution among the second other image capturing apparatuses.
Then, a window layout area of the fifth image capturing apparatus is determined based on the resolution of the fifth image capturing apparatus, wherein the window layout area of the fifth image capturing apparatus is adjacent to the window layout area of the second image capturing apparatus.
After the window layout area of the image pickup apparatus 4 is determined, if there is no image pickup apparatus having a vertical resolution equal to or smaller than the difference between the vertical resolution of the image pickup apparatus 1 and the vertical resolution of the image pickup apparatus 4 among the remaining image pickup apparatuses of the type a, the window layout area of the image pickup apparatus having the largest vertical resolution is disposed adjacent to the window layout area of the image pickup apparatus 4 among the remaining image pickup apparatuses of the type a.
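A simplified, non-limiting sketch of the greedy column layout for type-A apparatuses follows. It only stacks devices vertically beneath a column head, omits the side-by-side gap filling used for the image capturing apparatus 5 and the image capturing apparatus 6 above, and omits stretching and snapping, so it is an approximation of the described procedure rather than the procedure itself; all names are illustrative assumptions:

```python
def layout_type_a(devices):
    """Simplified greedy layout for type-A devices (illustrative only).

    devices: list of (name, horizontal_resolution, vertical_resolution).
    Returns {name: (x, y, width, height)} in pixel units of the layout area.
    The tallest device anchors the leftmost column; each subsequent column is
    headed by the tallest remaining device, and shorter devices that fit into
    the gap below a column head are stacked there.
    """
    remaining = sorted(devices, key=lambda d: (-d[2], -d[1]))   # tall first, then wide
    layout, x = {}, 0
    row_height = remaining[0][2] if remaining else 0            # set by the tallest device
    while remaining:
        name, w, h = remaining.pop(0)                            # column head
        layout[name] = (x, 0, w, h)
        gap, y, col_w = row_height - h, h, w
        while gap > 0:
            # find a remaining device that fits under this column head
            fit = next((d for d in remaining if d[2] <= gap and d[1] <= col_w), None)
            if fit is None:
                break
            remaining.remove(fit)
            layout[fit[0]] = (x, y, fit[1], fit[2])
            y += fit[2]
            gap -= fit[2]
        x += col_w
    return layout
```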
In the above-described embodiments, it is described how to determine the window layout area of the image capturing apparatus of the first image size type in the case where the image capturing apparatus of the at least two image capturing apparatuses includes at least two first image size types. In some cases, the at least two image capturing apparatuses further include an image capturing apparatus of the second image size type, and how to determine the window layout area of the image capturing apparatus of the second image size type will be described below with reference to the drawings.
Step 410, determining a first layout area corresponding to the first image size type in the layout areas based on the window layout areas of the image capturing apparatuses of the at least two first image size types.
The first layout area includes window layout areas of at least two image capturing apparatuses of the first image size type.
Taking the respective window layout areas of the type-A image capturing apparatuses as an example, the first layout area 81 corresponding to type A is determined from the extent of those window layout areas. The height of the first layout area 81 is determined by the vertical resolution of the image capturing apparatus 1, and its width is determined by the horizontal resolutions of the image capturing apparatus 1, the image capturing apparatus 2, the image capturing apparatus 3, the image capturing apparatus 4 and the image capturing apparatus 7.
In step 411, it is determined whether there is a free area in the first layout area.
The free area is an area other than the window layout areas of the image capturing apparatuses of the at least two first image size types in the first layout area.
There are free areas below the image pickup apparatus 6, the image pickup apparatus 4, and the image pickup apparatus 7, which are free area C, free area D, and free area E, respectively.
In the case where there is no image capturing apparatus of the second image size type, the window layout areas of the image capturing apparatuses of the first image size type may be stretched. The following takes as an example how the window layout areas of the type-A image capturing apparatuses are stretched when the free areas C, D and E exist and there is no type-B image capturing apparatus.
Since there is no type-B image capturing apparatus, for the free area C and the free area D, the window layout area of the image capturing apparatus 6 may first be stretched proportionally towards the bottom of the first layout area; if a gap remains, the window layout area of the image capturing apparatus 6 may then be expanded by border snapping (frame adsorption) so that the free area C and the free area D are filled.
While for the free area E, the window layout area of the rightmost image pickup apparatus 7 may be subjected to the equal-ratio stretching.
Alternatively, after the stretching is completed, the window layout areas in the entire first layout area may be equally stretched, so that the window layout areas of the respective image capturing apparatuses fill the entire first layout area and the entire second layout area.
Step 412 determines a window layout area of the second image size type image capturing apparatus based on the first number of free areas and the second number of second image size type image capturing apparatuses.
In the case where there is an image capturing apparatus of the second image size type, the first number of free areas and the second number of image capturing apparatuses of the second image size type are compared, so that a window layout area of the image capturing apparatus of the second image size type is determined based on the first number and the second number. Specific implementations are described below with respect to the first number and the second number of different relationship types in conjunction with the drawings.
In the case where the first number is greater than or equal to the second number, the window layout areas of the image capturing apparatuses of the second image size type may be determined in order of the free areas from large to small.
Specifically, in the case where the first number is greater than or equal to the second number, a larger free area may preferentially be used as the window layout area of an image capturing apparatus of the second image size type, and the picture of that image capturing apparatus may be stretched at an equal ratio to fit the corresponding window layout area.
The following example shows how to determine the window layout area of a single B-type image capturing apparatus when the three free areas C, D, and E exist and only one B-type image capturing apparatus is present.
Since there is only one B-type image capturing apparatus, the largest of the free areas C, D, and E, namely free area E, is determined as the window layout area of that B-type image capturing apparatus.
For free area C and free area D, the window layout area of image capturing apparatus 6 may first be stretched at an equal ratio to the bottom of the first layout area; if a gap remains, the window layout area of image capturing apparatus 6 is snapped to the neighbouring frame so that free area C and free area D are filled.
The following example shows how to determine the window layout areas of three B-type image capturing apparatuses when the free areas C, D, and E exist and there are three B-type image capturing apparatuses.
For the three B-type image capturing apparatuses, namely image capturing apparatus 8, image capturing apparatus 9, and image capturing apparatus 10, the apparatus whose vertical resolution best matches the height of each free area is determined based on the height of that free area, and the free area is then used as the window layout area of the matched apparatus. The window layout area of image capturing apparatus 8 is free area C, that of image capturing apparatus 9 is free area D, and that of image capturing apparatus 10 is free area E. In this case none of the window layout areas needs to be stretched.
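A minimal sketch, for illustration only, of the greedy matching described in the examples above, assuming each free area and each B-type image capturing apparatus is characterized only by its height in pixels; the function name and the tie-breaking choices are assumptions of this sketch, not details taken from the embodiments.

```python
def assign_free_areas(free_area_heights: dict[str, int],
                      camera_heights: dict[str, int]) -> dict[str, str]:
    """Pair each B-type camera with the free area whose height best matches
    its vertical resolution (the case where the first number of free areas
    is greater than or equal to the second number of cameras)."""
    assignment: dict[str, str] = {}
    remaining = dict(free_area_heights)
    # Handle cameras with larger vertical resolution first, so the larger
    # free areas are consumed preferentially.
    for camera, cam_h in sorted(camera_heights.items(),
                                key=lambda kv: kv[1], reverse=True):
        if not remaining:
            break
        # Pick the unassigned free area with the smallest height difference.
        best_area = min(remaining, key=lambda a: abs(remaining[a] - cam_h))
        assignment[camera] = best_area
        del remaining[best_area]
    return assignment

# Example: cameras 8-10 against free areas C, D, E.
print(assign_free_areas({"C": 360, "D": 480, "E": 720},
                        {"cam8": 360, "cam9": 480, "cam10": 720}))
# {'cam10': 'E', 'cam9': 'D', 'cam8': 'C'}
```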
In the case where the first number is smaller than the second number, it is first necessary to select, from the image capturing apparatuses of the second image size type and based on their resolutions, a first number of sixth image capturing apparatuses and the seventh image capturing apparatuses other than the sixth image capturing apparatuses.
Then, the free area is determined as a window layout area of the sixth image capturing apparatus, and the window layout area of the seventh image capturing apparatus is determined based on the resolution of the seventh image capturing apparatus.
Specifically, for each free area, the image capturing apparatus whose vertical resolution best matches the height of the free area may be determined among the image capturing apparatuses of the second image size type, and the free area is then used as the window layout area of that apparatus. In this way, the window layout areas of the sixth image capturing apparatuses are determined.
For the seventh image capturing apparatus, first, a second layout area corresponding to the second image size type is determined in the layout area, the second layout area being adjacent to the first layout area. Then, based on the resolution of the seventh image capturing apparatus, in the second layout area, a window layout area of the seventh image capturing apparatus is determined.
Illustratively, the seventh image capturing apparatuses include image capturing apparatus 11 and image capturing apparatus 12. Since the vertical resolution of image capturing apparatus 11 is larger than that of image capturing apparatus 12, the window layout area of image capturing apparatus 11 is determined first; it is adjacent to the window layout area of image capturing apparatus 1.
Then, a window layout area of the image pickup apparatus 12 is determined, the window layout area of the image pickup apparatus 12 being adjacent to the window layout area of the image pickup apparatus 11.
In some cases, if the window layout area of image capturing apparatus 12 does not reach the boundary of the second layout area, a first difference between the window layout area of image capturing apparatus 12 and the bottom of the second layout area, and a second difference between that window layout area and the rightmost side of the second layout area, may first be determined; equal-ratio stretching is performed with reference to the smaller of the two differences, and the window is then snapped to the frame on the other side. For example, the width of the window layout area of image capturing apparatus 12 may first be extended until it is aligned with the rightmost side of the first layout area, and the height may then be extended downward by the corresponding amount, so that the equal-ratio stretching introduces no distortion. If, after this extension, the window layout area of image capturing apparatus 12 is still not level with the window layout area of image capturing apparatus 11, it continues to be stretched downward until the heights are aligned. In this case the stretching causes some picture distortion, but the degree of distortion is small.
In some cases, if the width of the window layout area of image capturing apparatus 12 exceeds the boundary of the second layout area, the width of the window layout area of image capturing apparatus 12 needs to be reduced until it is aligned with the rightmost side of the first layout area. Its width may be compressed to align with the rightmost side of the first layout area, and the window is then snapped downward to the frame so that the window layout area of image capturing apparatus 12 is level with the window layout area of image capturing apparatus 11. In this case the stretching causes some picture distortion, but the degree of distortion is small.
Step 413, determining respective pane areas of the at least two image capturing apparatuses from the video playback window based on respective window layout areas of the at least two image capturing apparatuses and the size of the video playback window.
After obtaining the respective window layout areas of the at least two image capturing apparatuses, a ratio between the window layout areas of the respective image capturing apparatuses and the video playing window may be determined based on the respective window layout areas of the at least two image capturing apparatuses and the size of the video playing window, and then a pane area of the respective image capturing apparatuses may be determined based on the ratio.
First, a width and a height of a layout area are determined, wherein the width of the layout area is equal to a sum of horizontal resolutions of all image capturing apparatuses of the first image size type of the uppermost layer in the first layout area. In the case where only the image capturing apparatus of the first image size type is included in the at least two image capturing apparatuses, the height of the layout area is equal to the vertical resolution of the first image capturing apparatus; in the case where the image pickup apparatuses of the first image size type and the image pickup apparatuses of the second image size type are included in at least two image pickup apparatuses, the height of the layout area is equal to the vertical resolution of the first image pickup apparatus plus the vertical resolution of the leftmost image pickup apparatus of the second image size type.
Illustratively, the width of the layout area is equal to the sum of the horizontal resolutions of the image pickup apparatus 1, the image pickup apparatus 2, the image pickup apparatus 3, the image pickup apparatus 4, and the image pickup apparatus 7, and the height of the layout area is equal to the sum of the vertical resolutions of the image pickup apparatus 1 and the image pickup apparatus 11.
After the height and width of the layout area are obtained, the width scaling factor m and the height scaling factor n can be obtained by combining them with the size of the video playing window, where m = (width of the video playing window) / (width of the layout area) and n = (height of the video playing window) / (height of the layout area).
For each image capturing apparatus, the width of its pane area is obtained by multiplying the width of its window layout area by m, and the height of its pane area is obtained by multiplying the height of its window layout area by n.
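For illustration, a small sketch of the scaling described in the two preceding paragraphs; the tuple-based representation, function name, and rounding are assumptions of this sketch.

```python
def pane_size(window_layout_wh: tuple[int, int],
              layout_area_wh: tuple[int, int],
              play_window_wh: tuple[int, int]) -> tuple[int, int]:
    """Map one window layout area into the video playing window.

    m and n are the scaling factors defined above:
    m = play window width / layout area width,
    n = play window height / layout area height.
    """
    m = play_window_wh[0] / layout_area_wh[0]
    n = play_window_wh[1] / layout_area_wh[1]
    return round(window_layout_wh[0] * m), round(window_layout_wh[1] * n)

# Example: a 1920x1080 window layout area inside a 7680x2160 layout area,
# shown in a 1920x540 playing window, yields a 480x270 pane.
print(pane_size((1920, 1080), (7680, 2160), (1920, 540)))  # (480, 270)
```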
Then, a coordinate system is established, and the coordinates of the pane area of each image capturing apparatus are determined based on the height and width of the pane areas and the relative positional relationships between them. Windows are then opened according to the coordinates of each pane area, and the video captured by the corresponding image capturing apparatus is played in that pane area. The play mode may be adaptive proportional play or full-scale play. Specifically, for each image capturing apparatus, an error value reflecting the image stretching is calculated from the aspect ratio of the pane area and the resolution aspect ratio of the image capturing apparatus, and compared with a preset error threshold. If the error value is greater than or equal to the preset error threshold, the proportional play mode is selected; if the error value is smaller than the preset error threshold, the full-scale play mode is selected.
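The choice between the proportional and full-scale play modes can be sketched as follows; the specific error metric (relative aspect-ratio difference) and the default threshold of 0.1 are assumptions of this sketch, since the embodiment only states that an error value derived from the two aspect ratios is compared with a preset threshold.

```python
def choose_play_mode(pane_w: int, pane_h: int,
                     cam_w: int, cam_h: int,
                     error_threshold: float = 0.1) -> str:
    """Select the playback mode for one camera pane.

    The error value estimates how much the picture would be stretched if it
    were forced to fill the pane completely.
    """
    pane_ratio = pane_w / pane_h
    cam_ratio = cam_w / cam_h
    error = abs(pane_ratio - cam_ratio) / cam_ratio
    # Large error: keep the source aspect ratio (proportional play).
    # Small error: fill the pane (full-scale play); distortion is negligible.
    return "proportional" if error >= error_threshold else "full-scale"

print(choose_play_mode(480, 270, 1920, 1080))  # "full-scale" (ratios match)
print(choose_play_mode(480, 360, 1920, 1080))  # "proportional"
```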
In summary, according to the scheme of the embodiments of the application, the image capturing apparatuses are classified by their resolution aspect ratios, the window layout area of each image capturing apparatus is determined recursively, free areas are filled by equal-ratio stretching and scaling, the absolute coordinates of the pane areas corresponding to the regularly arranged window layout areas are calculated, the windows are laid out in a customized and rational manner, and an adaptive play mode is applied. As a result, image capturing apparatuses with different resolutions play video in pane areas whose proportions are close to the real resolution ratios, picture distortion is reduced, the utilization of the playing panes is improved, and the video display effect is improved.
The embodiments of the application thereby improve the degree of intelligence of the system and realize multidimensional digital image management of the smart campus.
Fig. 2 is a schematic flow chart of the multidimensional digital image smart campus digitizing method, which includes:
Step 101, acquiring video stream data to be processed in a campus area;
Step 102, identifying the video stream data to be processed to obtain user behavior data and map data to be processed;
Step 103, determining original user behavior information based on node coordinates of all limb nodes of the hand, and determining an original three-dimensional entity model of the building based on map data to be processed;
Step 104, digitizing the original user behavior information and the original three-dimensional entity model to obtain digitized user behavior information and a digitized three-dimensional entity model;
Step 105, performing multidimensional digital image management based on the digitized user behavior information and the digitized three-dimensional entity model.
In the embodiments of the application, video stream data to be processed in a campus area is collected and identified to obtain user behavior data and map data to be processed. Original user behavior information is determined based on the node coordinates of the limb nodes of the hand, and an original three-dimensional entity model of each building is determined based on the map data to be processed. The original user behavior information and the original three-dimensional entity model are then digitized to obtain digitized user behavior information and a digitized three-dimensional entity model, and multidimensional digital image management is performed based on them.
The embodiments of the application thereby improve the degree of intelligence of the system and realize multidimensional digital image management of the smart campus.
Fig. 3 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Fig. 3, the electronic device may include: a processor 310, a communications interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communications interface 320 and the memory 330 communicate with each other via the communication bus 340. The processor 310 may invoke logic instructions in the memory 330 to perform the multidimensional digital image smart campus digitizing method, which includes:
Collecting video stream data to be processed in a campus area;
Identifying the video stream data to be processed to obtain user behavior data and map data to be processed; the user behavior data comprise node coordinates of all limb nodes of the hand;
determining original user behavior information based on node coordinates of all limb nodes of the hand, and determining an original three-dimensional entity model of the building based on map data to be processed;
Digitizing the original user behavior information and the original three-dimensional entity model to obtain digitized user behavior information and a digitized three-dimensional entity model;
And carrying out multidimensional digital image management based on the digitized user behavior information and the digitized three-dimensional entity model.
Further, the logic instructions in the memory 330 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or by means of hardware. Based on this understanding, the foregoing technical solution, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (8)
1. The multidimensional digital image intelligent campus digitizing system is characterized by comprising a digitizing control center, a video acquisition unit, a video identification unit, an image construction unit, a digitizing unit, a digital image management unit, a video encoding unit, a video buffering unit and a video playing unit; the digitizing control center is respectively connected with the video acquisition unit, the video identification unit, the image construction unit, the digital image management unit, the video encoding unit, the video buffering unit and the video playing unit, and controls the digital image management unit, the video encoding unit, the video buffering unit and the video playing unit;
The video acquisition unit is used for: collecting video stream data to be processed in a campus area;
The video identification unit is used for: identifying the video stream data to be processed to obtain user behavior data and map data to be processed; the user behavior data comprise node coordinates of all limb nodes of the hand;
The image construction unit is used for: determining original user behavior information based on node coordinates of all limb nodes of the hand, and determining an original three-dimensional entity model of the building based on the map data to be processed;
The digitizing unit is used for: digitizing the original user behavior information and the original three-dimensional entity model to obtain digitized user behavior information and a digitized three-dimensional entity model;
The digital image management unit is used for: carrying out multidimensional digital image management based on the digitized user behavior information and the digitized three-dimensional entity model;
the video encoding unit is used for: encoding the video stream data to be processed;
The video buffering unit is used for: storing the video stream data to be processed;
the video playing unit is used for: playing the video stream data to be processed;
wherein the determining the original three-dimensional solid model of the building based on the map data to be processed comprises:
deleting redundant data in the map data to be processed to obtain target map data;
Screening a plurality of building panels related to a building from the target map data based on attributes of each panel in the target map data;
dividing the corresponding building type of each building panel based on the attribute of each building panel and the area of the building panel, and determining the building type of each building; the building types include: residential buildings and non-residential buildings;
determining building parameter information based on the building type of each building panel in the plurality of building panels, and generating a three-dimensional entity model to be processed of each building based on the building parameter information; the building parameter information includes: the floor height of the residential building and the floor height of the non-residential building;
Dividing the surface of the three-dimensional solid model to be processed of each building into a top surface and a vertical surface, and respectively generating different materials and textures for the top surface and the vertical surface of the three-dimensional solid model to be processed;
Adjusting the model texture mapping coordinates of the three-dimensional entity model to be processed of each building based on a unified texture coordinate system, and generating an original three-dimensional entity model of each building;
The dividing the corresponding building type of each building panel based on the attribute of each building panel of the plurality of building panels and the area of the building panel, and determining the building type of each building, comprises:
Deleting building panels with the area smaller than or equal to a first preset threshold value from the plurality of building panels;
determining the building type of each building according to the attributes of the building panels, and adjusting the building type of each building based on a preset adjustment rule;
Wherein, the preset adjustment rule includes: adjusting the building type of the building panel with the area being larger than a second preset threshold value and the building type being a residential building to be a non-residential building, and adjusting the building type of the building panel with the area being smaller than or equal to the second preset threshold value and the area being larger than the first preset threshold value to be a residential building; the first preset threshold is smaller than the second preset threshold.
2. A multidimensional digital image smart campus digitizing method, applied to the multidimensional digital image smart campus digitizing system of claim 1, comprising:
Collecting video stream data to be processed in a campus area;
identifying the video stream data to be processed to obtain user behavior data and map data to be processed; the user behavior data comprise node coordinates of all limb nodes of the hand;
Determining original user behavior information based on node coordinates of all limb nodes of the hand, and determining an original three-dimensional entity model of the building based on the map data to be processed;
Digitizing the original user behavior information and the original three-dimensional entity model to obtain digitized user behavior information and a digitized three-dimensional entity model;
carrying out multidimensional digital image management based on the digitized user behavior information and the digitized three-dimensional entity model;
wherein the determining the original three-dimensional solid model of the building based on the map data to be processed comprises:
deleting redundant data in the map data to be processed to obtain target map data;
Screening a plurality of building panels related to a building from the target map data based on attributes of each panel in the target map data;
dividing the corresponding building type of each building panel based on the attribute of each building panel and the area of the building panel, and determining the building type of each building; the building types include: residential buildings and non-residential buildings;
determining building parameter information based on the building type of each building panel in the plurality of building panels, and generating a three-dimensional entity model to be processed of each building based on the building parameter information; the building parameter information includes: the floor height of the residential building and the floor height of the non-residential building;
Dividing the surface of the three-dimensional solid model to be processed of each building into a top surface and a vertical surface, and respectively generating different materials and textures for the top surface and the vertical surface of the three-dimensional solid model to be processed;
Adjusting the model texture mapping coordinates of the three-dimensional entity model to be processed of each building based on a unified texture coordinate system, and generating an original three-dimensional entity model of each building;
The dividing the corresponding building type of each building panel based on the attribute of each building panel of the plurality of building panels and the area of the building panel, and determining the building type of each building, comprises:
Deleting building panels with the area smaller than or equal to a first preset threshold value from the plurality of building panels;
determining the building type of each building according to the attributes of the building panels, and adjusting the building type of each building based on a preset adjustment rule;
Wherein, the preset adjustment rule includes: adjusting the building type of the building panel with the area being larger than a second preset threshold value and the building type being a residential building to be a non-residential building, and adjusting the building type of the building panel with the area being smaller than or equal to the second preset threshold value and the area being larger than the first preset threshold value to be a residential building; the first preset threshold is smaller than the second preset threshold.
3. The method of claim 2, wherein the determining building parameter information based on the building type of each of the plurality of building panels comprises:
randomly selecting a first building layer height from a first random range as the building layer height of the building corresponding to a building panel whose building type is a residential building, and randomly selecting a second building layer height from a second random range as the building layer height of the building corresponding to a building panel whose building type is a non-residential building;
wherein any value in the first random range is greater than any value in the second random range, and the first building layer height is greater than the second building layer height.
4. The multi-dimensional digital image smart campus digitizing method of claim 2, further comprising:
acquiring a real foreground video and a pure background image of the video stream data to be processed, obtaining a video to be encoded by superimposing the real foreground video on the pure background image, and adding noise matched with the real foreground video to a background area of the video to be encoded;
Determining a current frame of the video to be encoded;
setting the pure background image as a first frame and encoding it to obtain encoding information in the case that the current frame belongs to the first image group of the video to be encoded, and copying the encoding information of the first frame of the image group preceding the current frame as the encoding information of the first frame of the current image group in the case that the current frame does not belong to the first image group of the video to be encoded;
performing coding tree partitioning on the frames following the first frame in each image group of the video to be encoded to obtain a coding tree;
and determining the encoded frames of the video to be encoded based on the encoding information of the first frame of each image group and the coding tree of each image group.
5. The multidimensional digital image smart campus digitizing method of claim 4, wherein the determining the encoded frames of the video to be encoded based on the encoding information of the first frame of each image group and the coding tree of each image group comprises:
under the condition that the coding tree of each image group comprises only the background area of the video to be encoded, performing predictive coding on the coding tree of each image group to obtain a prediction residual and a noise signal ratio;
encoding the prediction residual based on the noise signal ratio to obtain a target encoded frame;
and determining the encoded frames of the video to be encoded based on the encoding information of the first frame of each image group and the target encoded frame.
6. The multi-dimensional digital image smart campus digitizing method of claim 2, further comprising:
responding to playback operation of the video stream data to be processed, segmenting a period from a playing time point of the video stream data to be processed to an ending time point of the video stream data to be processed, and obtaining a first period and a second period;
Sequentially caching a plurality of first video frames in the first period according to the time sequence;
based on a preset buffering policy, buffering a plurality of second video frames in the second period; wherein the plurality of second video frames include a plurality of intra-coded frames and a plurality of inter-prediction coded frames, and the preset buffering policy includes: the buffering priority of the plurality of intra-coded frames is higher than the buffering priority of the plurality of inter-prediction coded frames.
7. The multi-dimensional digital image smart campus digitizing method of claim 2, further comprising:
determining the size of a video playing window and the respective resolutions of at least two image capturing apparatuses of the video stream data to be processed, the video stream data to be processed being played through the video playing window;
determining respective pane areas of the at least two image capturing apparatuses from the video playback window based on respective resolutions of the at least two image capturing apparatuses and sizes of the video playback window;
in each of the pane areas, a video corresponding to the image capturing apparatus is played.
8. The multi-dimensional digital video smart campus digitizing method of claim 7, wherein the determining the respective pane areas of the at least two image capturing devices from the video playback window based on the respective resolutions of the at least two image capturing devices and the size of the video playback window comprises:
determining, for each image capturing apparatus, a size type of an image captured by the image capturing apparatus based on a resolution of the image capturing apparatus;
determining respective window layout areas of the at least two image capturing apparatuses in the layout areas corresponding to the video playing windows based on respective resolutions and image size types of the at least two image capturing apparatuses;
and determining respective pane areas of the at least two image capturing apparatuses from the video playing window based on respective window layout areas of the at least two image capturing apparatuses and the size of the video playing window.
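Purely as an illustration of the area-threshold adjustment rule recited in claims 1 and 2, the following simplified Python sketch classifies building panels by area alone, ignoring the initial attribute-based type; the threshold values, names, and example areas are placeholders, not values given in the claims.

```python
def classify_building_panels(panel_areas: dict[str, float],
                             first_threshold: float = 50.0,
                             second_threshold: float = 400.0) -> dict[str, str]:
    """Apply a simplified version of the preset adjustment rule.

    Panels no larger than the first threshold are dropped; panels above the
    second threshold become non-residential; the rest become residential.
    """
    assert first_threshold < second_threshold
    types: dict[str, str] = {}
    for panel, area in panel_areas.items():
        if area <= first_threshold:
            continue  # too small: the panel is deleted
        types[panel] = "non-residential" if area > second_threshold else "residential"
    return types

# Example with hypothetical panel areas in square metres.
print(classify_building_panels({"dorm": 300.0, "gym": 1200.0, "kiosk": 20.0}))
# {'dorm': 'residential', 'gym': 'non-residential'}
```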
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311787565.4A CN117710893B (en) | 2023-12-25 | 2023-12-25 | Multidimensional digital image intelligent campus digitizing system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117710893A CN117710893A (en) | 2024-03-15 |
CN117710893B true CN117710893B (en) | 2024-05-10 |
Family
ID=90149602
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311787565.4A Active CN117710893B (en) | 2023-12-25 | 2023-12-25 | Multidimensional digital image intelligent campus digitizing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117710893B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118171782B (en) * | 2024-05-13 | 2024-07-16 | 成都理工大学工程技术学院 | Automobile noise prediction method and system |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101030938A (en) * | 2007-02-05 | 2007-09-05 | 北京大学 | System and method for organizing and transmitting QoS self-adaptive P2P flow medium data |
CN110111180A (en) * | 2019-04-30 | 2019-08-09 | 贝壳技术有限公司 | A kind of house decoration service push method and device |
WO2019242174A1 (en) * | 2018-06-21 | 2019-12-26 | 华南理工大学 | Method for automatically detecting building structure and generating 3d model based on laser radar |
CN111669600A (en) * | 2020-06-05 | 2020-09-15 | 浙江大华技术股份有限公司 | Video coding method, video coding device, video coder and storage device |
CN113672788A (en) * | 2021-07-22 | 2021-11-19 | 东南大学 | Urban building function classification method based on multi-source data and weight coefficient method |
CN114489546A (en) * | 2022-01-30 | 2022-05-13 | 深圳创维-Rgb电子有限公司 | Split screen display method, electronic equipment, storage medium and device |
CN115004273A (en) * | 2019-04-15 | 2022-09-02 | 华为技术有限公司 | Digital reconstruction method, device and system for traffic road |
CN116055726A (en) * | 2023-02-03 | 2023-05-02 | 红河学院 | Low-delay layered video coding method, computer equipment and medium |
CN116310192A (en) * | 2022-12-28 | 2023-06-23 | 江苏省测绘研究所 | Urban building three-dimensional model monomer reconstruction method based on point cloud |
CN116647644A (en) * | 2023-06-06 | 2023-08-25 | 上海优景智能科技股份有限公司 | Campus interactive monitoring method and system based on digital twin technology |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11868160B2 (en) * | 2022-02-09 | 2024-01-09 | Microsoft Technology Licensing, Llc | Just-in-time snap layouts |
Non-Patent Citations (1)
Title |
---|
Research on intelligent recognition and three-dimensional modeling of rural buildings based on multi-source data fusion; Chen Biao; Tropical Geography; 2023-02-22; Vol. 43, No. 02; full text *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||