CN115134581A - Fusion reproduction method, system, equipment and storage medium of image and sound - Google Patents

Fusion reproduction method, system, equipment and storage medium of image and sound

Info

Publication number
CN115134581A
CN115134581A
Authority
CN
China
Prior art keywords
image information
sound
information
target object
visual angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211043746.1A
Other languages
Chinese (zh)
Inventor
陈政权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Zhongsheng Matrix Technology Development Co ltd
Original Assignee
Sichuan Zhongsheng Matrix Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Zhongsheng Matrix Technology Development Co ltd filed Critical Sichuan Zhongsheng Matrix Technology Development Co ltd
Priority to CN202211043746.1A
Publication of CN115134581A
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/30 Image reproducers
    • H04N 13/366 Image reproducers using viewer tracking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/30 Image reproducers
    • H04N 13/398 Synchronisation thereof; Control thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S 7/30 Control circuits for electronic adaptation of the sound field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/13 Aspects of volume control, not necessarily automatic, in stereophonic sound systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention belongs to the technical field of images and sound, and provides a method, a system, a device and a storage medium for fusion reproduction of images and sound. The method for fusion reproduction of images and sound comprises the following steps: step S100: acquiring image information and sound information of a target object at multiple viewing angles; step S200: establishing, on the basis of viewing angle, a mapping relation between the image information and the sound information at different viewing angles; step S300: acquiring an input viewing angle, determining the image information and sound information corresponding to the input viewing angle according to the mapping relation, and reproducing them. The method can faithfully present the three-dimensional image information and the sound information of the target object at different viewing angles.

Description

Fusion reproduction method, system, equipment and storage medium of image and sound
Technical Field
The present invention relates to the field of image and sound reproduction, and in particular, to a method, system, device, and storage medium for fusion reproduction of images and sounds.
Background
In the prior art, the settings in which sound and picture are presented together so that both can be perceived at once are display terminals and cinemas, which bring visual and auditory enjoyment to audiences.
At present, most existing sound presentation methods are mono or stereo, where stereo usually means just two audio channels played through two speakers or headphones. These methods cannot faithfully convey to users how sound behaves in reality, especially in audio-visual applications such as cinemas and games: in a cinema, audiences at different positions are served the same sound effect, whereas in a real venue audiences at different positions hear different sound.
In the prior art, an object may be displayed as a two-dimensional picture or in three dimensions (stereoscopically); three-dimensional display is usually achieved either by three-dimensional modeling or by rendering surfaces or point clouds from data obtained with three-dimensional scanning.
However, three-dimensional display of an object through modeling or three-dimensional scanning has the following problems: (1) The object loses the information it carries in the course of three-dimensional display, so its true and complete information cannot be shown; the displayed three-dimensional object is not a three-dimensional presentation of the object in the real sense and easily creates a visual illusion, even though the aim is to display the object in three dimensions together with all of its information. (2) Some objects cannot be displayed through scanning or modeling at all, for example flames, smoke, water drops and plasmas, which are highly luminous, highly reflective or highly transmissive. Current three-dimensional scanning cannot record such objects in three dimensions because, analyzed from the technical principle and path of the traditional three-dimensional reconstruction pipeline, their details, physical properties, and the interaction information between the object and its environment are lost.
In summary, in the prior art the picture cannot provide a true three-dimensional record of an object and the sound cannot be presented faithfully. The existing fusion display of sound and picture is based only on time synchronization; the two are not truly fused and are in essence separate: the sound does not change in step with the picture, the picture does not change in step with the sound, and the real information of the object is not displayed.
Disclosure of Invention
The present invention aims to provide a method, a system, a device and a storage medium for fusion reproduction of images and sound that solve the above technical problems in the prior art. The disclosure mainly comprises the following four aspects:
The first aspect of the present application provides a method for fusion reproduction of images and sound, comprising the following steps:
step S100: acquiring image information and sound information of a target object at multiple viewing angles;
step S200: establishing, on the basis of viewing angle, a mapping relation between the image information and the sound information at different viewing angles;
step S300: acquiring an input viewing angle, determining the image information and sound information corresponding to the input viewing angle according to the mapping relation, and reproducing them.
Further, the method further comprises:
acquiring image information and sound information of the target object at the same viewing angle, a mapping relation being formed between the image information and the sound information.
Further, the method comprises:
calculating the maximum boundary radius of the target object at a viewing angle, and taking the edge formed by the maximum boundary radius as the viewing-angle boundary;
acquiring image information of the target object within the viewing-angle boundary at multiple viewing angles.
Further, the method comprises:
acquiring input sound information, resolving the viewing angle corresponding to the input sound information, and reproducing the image information of that viewing angle;
acquiring input image information, resolving the viewing angle corresponding to the input image information, and reproducing the sound information of that viewing angle.
Further, the sound information includes at least one audio track:
when the sound information includes one audio track, reproducing the image information corresponding to that audio track;
when the sound information includes multiple audio tracks, acquiring an input audio track, resolving the image information corresponding to the input audio track, and reproducing it.
The second aspect of the present application provides a system for fusion reproduction of images and sound, comprising the following modules:
an acquisition module: used for acquiring image information and sound information of a target object at multiple viewing angles;
a mapping module: used for establishing, on the basis of viewing angle, a mapping relation between the image information and the sound information at different viewing angles;
a reproduction module: used for acquiring an input viewing angle, determining the image information and sound information corresponding to the input viewing angle according to the mapping relation, and reproducing them.
Further, the mapping module is used for:
acquiring image information and sound information of the target object at the same viewing angle, the image information and the sound information forming a mapping relation.
Further, the acquisition module is further used for:
calculating the maximum boundary radius of the target object at a viewing angle, and taking the edge formed by the maximum boundary radius as the viewing-angle boundary;
acquiring image information of the target object within the viewing-angle boundary at multiple viewing angles.
The third aspect of the present application provides an electronic device, comprising:
one or more processors;
a memory;
a screen for displaying the images and sound in the above method;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method described above.
The fourth aspect of the present application provides a computer-readable storage medium,
in which program code is stored, the program code being called by a processor to perform the method described above.
Compared with the prior art, the invention has at least the following technical effects:
(1) According to the method for fusion reproduction of images and sound provided by the present application, image information and sound information of a target object are acquired at multiple viewing angles; on the basis of viewing angle, a mapping relation is established between the image information and the sound information at different viewing angles; an input viewing angle is acquired, the image information and sound information corresponding to it are determined from the established mapping relation, and the two are reproduced together as a fused whole. The method can faithfully present the three-dimensional image information and the sound information of the target object at different viewing angles.
(2) With this method, the image information and sound information of the target object at different viewing angles can be reproduced at any time. Furthermore, during reproduction the sound information of the target object changes as the image information changes; for example, the sound follows viewing-angle changes in the image such as zooming in and out, distance, rotation and translation. At the same time, the image information of the target object also changes as the sound information changes; for example, the image displays the corresponding picture when audio tracks are switched in the sound information. The image information and sound information of the target object at different viewing angles can thus be faithfully reproduced, true fusion of sound information and image information is achieved, and the target object is displayed as it really is.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments or in the description of the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an image and sound fusion method according to the present invention;
FIG. 2 is a first diagram illustrating a mapping relationship between sound information and image information in the present invention;
FIG. 3 is a second diagram illustrating a mapping relationship between sound information and image information according to the present invention;
FIG. 4 is a schematic diagram of an electronic device according to the present invention;
fig. 5 is a schematic structural diagram of a computer-readable storage medium in the present invention.
Detailed Description
The following description provides many different embodiments, or examples, for implementing different features of the invention. The particular examples set forth below are illustrative only and are not intended to be limiting.
At present, the method adopted in the prior art for fused display of picture and sound is time-based: display is synchronized along a time axis. This easily misleads viewers into thinking the picture and sound are reproduced as one, when in fact no mapping relation exists between them. Picture and sound are displayed separately, no relation links picture, sound and viewing angle, the two cannot stay synchronized during display, the real relation and interplay between sound and picture cannot be shown, and the experience brought to users is poor.
It should be noted that the target object in the present application may be a cinema, a concert, a fashion show, a football match, a basketball game and so on; any scene with picture and sound may be regarded as a target object, so the target object in the present application is not specifically limited.
For example, when the target object is a concert, with the performance area as the origin, the effect presented in the prior art is that spectators at different positions hear the same performance; spectators in two adjacent rows, for instance, hear and see the same performance effect. Likewise, a spectator at a fixed position hears the same effect whether facing the performance area or facing away from it. In reality, however, because a listener has a left channel and a right channel, turning from facing the performance area to facing away from it swaps the positions of the two channels, and the performance sounds different before and after the swap. In addition, when a concert is watched from different viewing angles, the picture seen and the sound heard also differ.
In order to present the real picture and sound of a target object at different viewing angles, the present application provides a method for fusion reproduction of images and sound, which is described below.
Embodiment one:
As shown in fig. 1 to fig. 3, an embodiment of the present application provides a method for fusion reproduction of images and sound, comprising the following steps:
step S100: acquiring image information and sound information of a target object at multiple viewing angles;
step S200: establishing, on the basis of viewing angle, a mapping relation between the image information and the sound information at different viewing angles;
step S300: acquiring an input viewing angle, determining the image information and sound information corresponding to the input viewing angle according to the mapping relation, and reproducing them.
Further, the method further comprises:
acquiring image information and sound information of the target object at the same viewing angle, a mapping relation being formed between the image information and the sound information.
Further, the method comprises:
calculating the maximum boundary radius of the target object at a viewing angle, and taking the edge formed by the maximum boundary radius as the viewing-angle boundary;
acquiring image information of the target object within the viewing-angle boundary at multiple viewing angles.
In the above solution, the image information and sound information of the target object at different viewing angles must first be acquired. The two may be acquired simultaneously or separately; the acquisition methods are introduced below.
The method for acquiring the image information of the target object is as follows:
Before the acquisition device captures the target object, in order to reduce the influence of redundant interference information during capture, the viewing-angle boundaries of the target object at the different viewing angles must first be calculated.
For example, when the image information of the target object at viewing angle 1 needs to be collected, first obtain the image presented by the target object at viewing angle 1, calculate the center position of the target object in that image, compute the corresponding maximum boundary radius with the center position as the circle center, and take the edge of the circle (or ellipse) formed by the maximum boundary radius as the viewing-angle boundary; a boundary formed in this way completely contains the target object. For better understanding, assume the image presented by the target object at viewing angle 1 is a rectangle: calculate the center position of the rectangle, take it as the circle center, and compute the maximum boundary radius about that center (the circumscribed radius), which defines the circumscribed circle of the rectangle; the edge of the circumscribed circle serves as the viewing-angle boundary, and the rectangle is completely contained in it. If the image presented at viewing angle 2 is a triangle, calculate the center position of the triangle, take it as the circle center, compute the maximum boundary radius (circumscribed radius) to obtain the circumscribed circle of the triangle, and take its edge as the viewing-angle boundary, so that the triangle is completely contained in the circumscribed circle.
Further, the maximum boundary radii of the target object at all viewing angles are compared and the longest is selected as the overall maximum boundary radius; the spherical viewing-angle boundary it forms completely encloses the target object, and the image information of the target object at each viewing angle is collected on this basis. Alternatively, the ellipse (or ellipsoid) edge formed by combining the maximum boundary radii of any two or more viewing angles can serve as the viewing-angle boundary within which the image information at each viewing angle is collected. The shapes of the images presented by the target object at different viewing angles may be all the same, partially the same, or all different.
It should be noted that the acquisition device in the present application may be a camera, a virtual camera and the like; anything that can capture an image of the target object will do, and it is not specifically limited here.
In the present application, before the image information of the target object is acquired, the maximum boundary radius of the target object's displayed image at the corresponding viewing angle is calculated, the edge formed by that radius is taken as the viewing-angle boundary, and the displayed image is contained within it. During acquisition, noise information outside the viewing-angle boundary at the corresponding viewing angle is therefore discarded and only the needed information is collected, which prevents extraneous information from affecting subsequent three-dimensional reproduction and also reduces the amount of image data.
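As a concrete illustration of the boundary computation described above, the following Python sketch treats each view's displayed shape as a polygon, takes the distance from its center to its farthest vertex as the maximum boundary radius (the circumscribed radius), and picks the longest radius across views as the spherical boundary. The shape data, the names, and the use of the vertex centroid as the "center position" are our assumptions for illustration, not details fixed by the patent.

```python
import math

def circumscribed_radius(vertices):
    """Maximum boundary radius of one view: distance from the shape's
    center position (here, the vertex centroid) to its farthest vertex."""
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    return max(math.hypot(x - cx, y - cy) for x, y in vertices)

def spherical_boundary_radius(views):
    """Compare the per-view radii and keep the longest one; the sphere it
    defines encloses the target object at every viewing angle."""
    return max(circumscribed_radius(v) for v in views.values())

# A rectangle at viewing angle 1 and a triangle at viewing angle 2,
# matching the example in the text.
views = {
    "angle_1": [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)],
    "angle_2": [(0.0, 0.0), (3.0, 0.0), (1.5, 2.5)],
}
print(round(spherical_boundary_radius(views), 3))  # radius of the sphere
```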
Image information of the target object is acquired at multiple viewing angles within the viewing-angle boundary, the image information comprising initial image information and/or fused image information.
Acquiring the fused image information comprises:
acquiring the acquisition distance between the acquisition device and the target object at the corresponding viewing angle;
dividing the acquisition distance into several preset acquisition distances according to a preset count, the acquisition device collecting cut-image information at each preset acquisition distance;
fusing the several pieces of cut-image information at the same viewing angle to form the fused image information.
After the viewing-angle boundaries of the target object's displayed images at the different viewing angles have been calculated, the acquisition device is started to collect the image information of the target object within the boundary at the corresponding viewing angle. Two modes can be adopted:
One mode is to directly acquire (shoot) the initial image information of the target object at the different viewing angles. The initial image information contains the original information carried by the target object itself at the moment of shooting; it is the information of the target object at the current moment as captured by the acquisition device, with no processing applied to the image.
The other mode fuses the image information collected at each viewing angle. The specific method is as follows: first obtain the acquisition distance between the target object and the acquisition device at each viewing angle; then set a preset count, i.e. the number of segments into which the acquisition distance is divided, and collect the corresponding image information (cut-image information) at each division point at that viewing angle; finally, after the cut-image information at every division point at the same viewing angle has been collected, fuse all of it according to a fusion rule to obtain the fused image information at that viewing angle. Suppose there are viewing angles 1, 2, 3, 4 and so on. For viewing angle 1, first obtain the acquisition distance at that angle, then set the preset count of cuts, say 3, and collect the cut-image information at each cutting distance or cutting point: cut image 1 at cutting distance 1, cut image 2 at cutting distance 2 and cut image 3 at cutting distance 3. Cut images 1, 2 and 3 are then combined according to a rule, such as ordering by cutting distance or by depth from the acquisition device, to form the fused image information of the target object at viewing angle 1.
It should be noted that the acquisition distances between the target object and the acquisition devices at different viewing angles may be all the same, partially the same, or all different; the preset counts into which those distances are divided may likewise be all the same, partially the same, or all different. The manner of dividing the acquisition distance is also not limited: it may be divided uniformly, or densely where the target object carries more image information and sparsely where it carries less. The slicing-and-fusion procedure is sketched in code below.
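The following sketch, under our own assumptions about data layout (a nonzero pixel means "content captured at this cut"), illustrates the division of an acquisition distance into a preset count of cut points and one possible fusion rule ordered by cutting distance; the patent leaves the actual fusion rule open.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class CutImage:
    distance: float     # cutting distance at which this slice was captured
    pixels: np.ndarray  # H x W array; 0 means nothing captured at this cut

def slice_distances(total_distance, preset_count):
    """Divide the acquisition distance uniformly into `preset_count` cut
    points (non-uniform, density-driven schedules are equally allowed)."""
    step = total_distance / (preset_count + 1)
    return [step * (i + 1) for i in range(preset_count)]

def fuse_cut_images(cuts):
    """Toy fusion rule by cutting distance: composite far-to-near, so nearer
    slices overwrite farther ones wherever they captured content."""
    fused = np.zeros_like(cuts[0].pixels)
    for cut in sorted(cuts, key=lambda c: c.distance, reverse=True):
        mask = cut.pixels != 0
        fused[mask] = cut.pixels[mask]
    return fused

# Preset count of 3 cuts at viewing angle 1, as in the example above.
cuts = [CutImage(d, np.random.randint(0, 2, (4, 4)))
        for d in slice_distances(9.0, 3)]
print(fuse_cut_images(cuts))
```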
It should be noted that, when the image information of the target object within the viewing-angle boundary is collected, either a regular or an irregular acquisition pattern may be adopted; and either the image information at different viewing angles at the same moment, or the image information at different viewing angles over a period of time, may be collected.
It should also be noted that different kinds of image information may be acquired at different viewing angles, or the same kind: for example, the initial image information may be acquired at viewing angle 1 and the fused image information at viewing angle 2, or both initial and fused image information may be acquired together at viewing angle 1; this is not limited here.
It should further be noted that the image information obtained at a given viewing angle may be one complete piece of image information, or a complete image stitched together from several pieces of sub-image information.
In this application, the acquisition distance between the acquisition device and the target object at a given viewing angle is obtained, the distance is divided according to the preset count, the corresponding cut-image information is collected, and the cut images are fused. The image information obtained this way at a given viewing angle is clearer and carries less noise, so the three-dimensional reproduction of the target object is smoother and faster.
Further, after the image information of the target object within the viewing-angle boundary has been acquired at multiple viewing angles, the method further comprises:
calculating a preset area in the acquired image information;
cropping the area outside the preset area to obtain the image information of the preset area.
In this scheme, the preset area is the area where the target object is located. For initial image information, the preset area in the initial image information is calculated, and a segmentation method is then used to crop away the area that does not contain the target object, yielding preset-area image information covering only the target object's area. For fused image information, the preset area is calculated in each piece of cut-image information at the same viewing angle, and the area not containing the target object is cropped from each cut image, so that every piece of cut-image information contains only the preset-area image information of the target object's area.
By calculating the preset area in the image information and cropping everything outside it, the corresponding preset-area image information at the different viewing angles is obtained. Cropping away the information other than the target object greatly reduces the noise carried by the image information and the amount of data involved in three-dimensional reproduction, improving its efficiency.
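A minimal sketch of this cropping step, assuming the target object has already been segmented as the nonzero pixels of the frame (the patent does not fix a particular segmentation method):

```python
import numpy as np

def preset_area(pixels):
    """Locate the preset area as the bounding box of the segmented target
    object, here simply the nonzero pixels of the frame."""
    rows = np.any(pixels != 0, axis=1)
    cols = np.any(pixels != 0, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return slice(r0, r1 + 1), slice(c0, c1 + 1)

def crop_to_preset_area(pixels):
    """Cut away everything outside the preset area, keeping only the
    image information of the area where the target object is located."""
    rs, cs = preset_area(pixels)
    return pixels[rs, cs]

frame = np.zeros((6, 8), dtype=int)
frame[2:4, 3:6] = 7                 # the target object's area
print(crop_to_preset_area(frame))   # 2 x 3 block of 7s
```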
The method for acquiring the sound information of the target object is as follows:
It should be noted that the recording device collecting the sound in the present application may be a device that records only sound, or a camera device that collects sound and image simultaneously; this is not limited here.
Sound information of the target object is collected at multiple viewing angles; the viewing angles to be collected include viewing angle 1, viewing angle 2, viewing angle 3, viewing angle 4 and so on.
It should be noted that the number of pieces of sound information collected at a given viewing angle is not less than the number of pieces of image information, because sound information may later be needed at different positions between the audience and the sound source (performance area) within the image information at that viewing angle.
After the image information and sound information of the target object at the multiple viewing angles have been collected, the mapping relation among viewing angle, image information and sound information is established according to the relationship among the three. In the present application the viewing angle is the basis, i.e. the mapping relation between image information and sound information is established with the viewing angle as the origin. For example, taking viewing angle 1 as the basis, the image information and sound information collected at viewing angle 1 are obtained and a mapping relation is established between them; the mapping relations of the target object at the other viewing angles are established in the same way. With the established mapping relations, when any one of viewing angle, image information and sound information is input, the other two pieces of information corresponding to the input can be found and determined quickly. For instance, once an input viewing angle is obtained, the image information and sound information corresponding to it are determined from the mapping relation among viewing angle, sound information and image information, and the image and sound at that viewing angle are then reproduced.
Optionally, the mapping relation among viewing angle, sound information and image information may be established in three ways: 1) at acquisition time: when the image information and sound information of the target object are collected by viewing angle, the sound and picture collected at the same viewing angle are linked, and the acquisition process itself establishes the mapping relation; 2) by later assignment: the corresponding image information and sound information at different viewing angles are bound to each other afterwards; 3) by computation: the image information and sound information at a certain viewing angle at a certain moment are computed from the currently available image and sound information, and the mapping relation between them is then established.
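A minimal sketch of such a viewing-angle-keyed mapping is given below; the registry class, its field names and the string identifiers are our illustrative assumptions, not structures prescribed by the patent.

```python
class FusionRegistry:
    """Viewing-angle-keyed mapping among viewing angle, image information
    and sound information, with reverse indexes from either modality."""

    def __init__(self):
        self.by_angle = {}   # viewing angle -> (image_id, sound_id)
        self.by_image = {}   # image_id -> viewing angle
        self.by_sound = {}   # sound_id -> viewing angle

    def register(self, angle, image_id, sound_id):
        """Establish the mapping relation for one viewing angle
        (e.g. at acquisition time, way 1 above)."""
        self.by_angle[angle] = (image_id, sound_id)
        self.by_image[image_id] = angle
        self.by_sound[sound_id] = angle

    def reproduce(self, angle):
        """Step S300: an input viewing angle yields both modalities."""
        return self.by_angle[angle]

reg = FusionRegistry()
reg.register("angle_1", "img_1", "snd_1")
print(reg.reproduce("angle_1"))   # ('img_1', 'snd_1')
```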
The method for reproducing the image information and sound information corresponding to an input viewing angle of the target object is as follows:
The image information and sound information at the different viewing angles are obtained and given attribute numbers, the image and sound information is stored at preset positions of a storage unit according to the attribute numbers, the reproduction unit sets reproduction serial numbers, and the three-dimensional reproduction of the target object is completed according to the preset positions and the reproduction serial numbers.
In the above scheme, after the image information and sound information of the target object at the different viewing angles have been collected, the fusion reproduction of the target object's sound and image can be completed from the collected information. There are two reproduction modes:
First, after the image information of the target object is acquired, the display end can reproduce directly from the acquired image and sound information. This can be understood as fused reproduction of the target object's image information and sound information: the image and sound of the three-dimensional object at different moments are reproduced in real time, no intermediate link is passed through, the acquired image and sound information is not lost, and the real information is reproduced directly, achieving true reproduction of the target object's image and sound.
Second, after the image information and sound information of the target object are acquired, attribute numbers are first established for the information collected at the different viewing angles. The attribute numbers can be built from the viewing-angle information, orientation information (such as longitude and latitude) and time information in the image and sound information; for example, the image information at viewing angle 1 might be numbered 1-45-3, 001, abc, 1_1_0 or 1-0. The numbering rule is not limited, as long as it can identify the image and sound information at the current viewing angle at the current moment. The image and sound information with different attribute numbers is then stored at preset positions of the storage device; the mapping relation between attribute number and preset position makes storage convenient and eases subsequent retrieval. Finally, at reproduction time, reproduction serial numbers are set for the different viewing angles of the target object, and the image and sound information stored at the preset positions is called to the reproduction position of the corresponding reproduction serial number; the mapping relation between preset position and reproduction serial number enables fast retrieval, completing the fusion reproduction of the target object's sound and image.
The storage device may be the acquisition device itself, a backend server connected to the acquisition device, and the like; it is not limited here.
Preferably, the image information includes: target object information, viewing-angle information, orientation information (such as longitude and latitude) and time information.
The sound information includes: audio track, viewing-angle information, orientation information (such as longitude and latitude) and time information.
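The record layouts and one possible numbering rule can be sketched as follows; the field names, the key format and the in-memory dict standing in for the "preset positions" of the storage device are all our assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class ImageInfo:
    target: str          # target object information
    angle: str           # viewing-angle information
    orientation: tuple   # orientation information, e.g. (latitude, longitude)
    time: float          # time information

@dataclass
class SoundInfo:
    track: str           # audio track
    angle: str
    orientation: tuple
    time: float

def attribute_number(info):
    """One possible numbering rule (the patent leaves the rule open, citing
    forms such as '1-45-3'): join angle, orientation and time into a key."""
    lat, lon = info.orientation
    return f"{info.angle}-{lat}-{lon}-{info.time:g}"

storage = {}   # attribute number -> preset position (here, the record itself)
img = ImageInfo(target="concert", angle="1", orientation=(45, 3), time=0.0)
storage[attribute_number(img)] = img
print(list(storage))   # ['1-45-3-0']
```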
Thus, according to the method for fusion reproduction of images and sound provided by the present application, image information and sound information of the target object are acquired at multiple viewing angles; a mapping relation between the image information and the sound information at different viewing angles is established on the basis of viewing angle; an input viewing angle is acquired, the image information and sound information corresponding to it are determined from the established mapping relation, and the two are reproduced together as a fused whole. The three-dimensional image information and sound information of the target object at different viewing angles can thereby be presented faithfully.
Further, the method comprises:
acquiring input sound information, resolving the viewing angle corresponding to the input sound information, and reproducing the image information of that viewing angle;
acquiring input image information, resolving the viewing angle corresponding to the input image information, and reproducing the sound information of that viewing angle.
Further, the sound information includes at least one audio track:
when the sound information includes one audio track, reproducing the image information corresponding to that audio track;
when the sound information includes multiple audio tracks, acquiring an input audio track, resolving the image information corresponding to the input audio track, and reproducing it.
In the above solution, step S300 may also synchronously reproduce the non-input information from the input sound information or image information, in the following ways:
On the basis of the established mapping relation among viewing angle, image information and sound information, different information can be input as needed.
When sound information is input, the viewing angle corresponding to it is obtained directly through the established mapping relation among viewing angle, image information and sound information; the corresponding image information is then found from that viewing angle and reproduced. For example, at a concert, if the input sound information is a triangle (the percussion instrument), the corresponding viewing angle can be obtained from the triangle's sound information collected at that moment, and the image information of the corresponding triangle picture is then obtained from that viewing angle.
When image information is input, the viewing angle corresponding to it is likewise obtained directly through the established mapping relation; the corresponding sound information is then found from that viewing angle and reproduced. For example, in a cinema, if the input image information is the picture of a bomb exploding, the corresponding viewing angle can be obtained from the explosion picture captured at that moment, and the sound information corresponding to that picture is then obtained from that viewing angle. Both lookups are sketched below, continuing the registry example.
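Continuing the FusionRegistry sketch above (still under the same illustrative assumptions), the two reverse lookups read:

```python
# Either modality resolves to its viewing angle, which yields the other.
def image_for_sound(reg, sound_id):
    angle = reg.by_sound[sound_id]     # resolve viewing angle from sound
    image_id, _ = reg.by_angle[angle]
    return image_id

def sound_for_image(reg, image_id):
    angle = reg.by_image[image_id]     # resolve viewing angle from image
    _, sound_id = reg.by_angle[angle]
    return sound_id

print(image_for_sound(reg, "snd_1"))   # img_1, e.g. the triangle's picture
print(sound_for_image(reg, "img_1"))   # snd_1, e.g. the explosion's sound
```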
When the sound information of the target object is collected, the following two situations may occur:
When the sound information includes one audio track, the image information corresponding to that track is reproduced directly, without image switching. For example, if the single audio track in the sound information is a piano, the corresponding image displayed also contains the piano sound captured at acquisition time, i.e. only the image associated with that track needs to be shown.
When the sound information includes multiple audio tracks, switching between tracks also switches the corresponding picture. If the tracks are, for example, a pipe, a violin and a timpani, then switching to the pipe track switches the reproduction to the pipe's corresponding image information, and switching to the timpani track switches it to the timpani's corresponding image information, achieving the fusion reproduction of the target object's image and sound. A sketch of this track-driven switching follows.
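A minimal sketch of the track-driven switching, with track and image identifiers of our own choosing:

```python
# Each audio track maps to the image information captured with it.
track_to_image = {
    "pipe":    "img_pipe",
    "violin":  "img_violin",
    "timpani": "img_timpani",
}

def reproduce_for_track(tracks, active=None):
    """Single track: reproduce its image directly. Multiple tracks: the
    input (active) track selects which image information to reproduce."""
    if len(tracks) == 1:
        return track_to_image[tracks[0]]
    if active not in tracks:
        raise ValueError("input track not present in the sound information")
    return track_to_image[active]

print(reproduce_for_track(["violin"]))                               # img_violin
print(reproduce_for_track(["pipe", "violin", "timpani"], "timpani")) # img_timpani
```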
With this method, the image information and sound information of the target object at different viewing angles can be reproduced at any time. Furthermore, during reproduction the sound information of the target object changes as the image information changes; for example, the sound follows viewing-angle changes in the image such as zooming in and out, distance, rotation and translation. At the same time, the image information of the target object also changes as the sound information changes; for example, the image displays the corresponding picture when audio tracks are switched in the sound information. The image information and sound information of the target object at different viewing angles can thus be faithfully reproduced, true fusion of sound information and image information is achieved, and the target object is displayed as it really is.
Embodiment two:
the second embodiment of the present application provides a system for fusion and reproduction of images and sound, further including the following modules:
an acquisition module: the system comprises a display unit, a display unit and a control unit, wherein the display unit is used for acquiring image information and sound information of a target object under multiple visual angles;
a mapping module: the mapping relation between the image information and the sound information under different visual angles is established on the basis of the visual angles;
a reproduction module: and the system is used for acquiring the input visual angle, determining the image information and the sound information corresponding to the input visual angle according to the mapping relation, and reproducing the image information and the sound information corresponding to the input visual angle.
Further, the mapping module is to:
and acquiring image information and sound information of the target object under the same visual angle, wherein the image information and the sound information form a mapping relation.
Further, the obtaining module is further configured to:
calculating the maximum boundary radius of the target object under the view angle, and taking the edge formed by the maximum boundary radius as the view angle boundary;
and acquiring image information of the target object in the view angle boundary under a plurality of view angles.
Further, the reproduction module is further configured to:
acquiring input sound information, analyzing a visual angle corresponding to the input sound information, and reproducing image information of the visual angle corresponding to the sound information;
acquiring input image information, analyzing a visual angle corresponding to the image information, and reproducing sound information of the visual angle corresponding to the image information.
Further, the reproduction module is further specifically configured to:
reproducing image information corresponding to a track when the sound information includes the track;
when the sound information includes a plurality of tracks, an input track is acquired, image information corresponding to the input track is analyzed, and image information corresponding to the input track is reproduced.
It is clear to those skilled in the art that, for convenience and brevity of description, for the specific working processes of the system and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; they are not repeated here.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Embodiment three:
an embodiment of the present application provides an electronic device, including:
one or more processors;
a reservoir;
a screen for displaying images and sounds in the above method;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the methods as described above.
Referring to fig. 4, fig. 4 is a block diagram of an electronic device 1100 according to a third embodiment of the present disclosure. The electronic device 1100 in the present application may include one or more of the following components: memory 1110, processor 1120, screen 1130, and one or more applications, wherein the one or more applications may be stored in memory 1110 and configured to be executed by the one or more processors 1120, the one or more programs configured to perform the methods as described in the aforementioned method embodiments.
The memory 1110 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 1110 may be used to store instructions, programs, code sets, or instruction sets. The memory 1110 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a histogram equalization function), instructions for implementing the method embodiments described above, and the like. The data storage area may also store data created during use by the electronic device 1100 (such as image matrix data).
The processor 1120 may include one or more processing cores. The processor 1120 connects the various parts of the electronic device 1100 using various interfaces and lines, and performs the various functions of the electronic device 1100 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 1110 and calling the data stored in the memory 1110. Optionally, the processor 1120 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 1120 may integrate one or more of a Central Processing Unit (CPU), a modem, and the like. The CPU mainly handles the operating system, application programs and so on; the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 1120 and may instead be implemented by a separate communication chip.
Embodiment four:
a fourth embodiment of the present application provides a computer-readable storage medium,
the computer readable storage medium has stored therein program code that is invoked by a processor to perform the method described above.
Referring to fig. 5, fig. 5 is a block diagram illustrating a computer-readable storage medium according to a fourth embodiment of the present disclosure. The computer readable storage medium 1200 has stored therein a program code 1210, said program code 1210 being invokable by a processor for performing the method described in the above method embodiments.
The computer-readable storage medium 1200 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM (erasable programmable read-only memory), a hard disk, or a ROM. Optionally, the computer-readable storage medium 1200 comprises a non-volatile computer-readable storage medium. The computer-readable storage medium 1200 has storage space for the program code 1210 that performs any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 1210 may, for example, be compressed in a suitable form.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for fusion reproduction of images and sound, comprising the following steps:
step S100: acquiring image information and sound information of a target object at multiple viewing angles;
step S200: establishing, on the basis of viewing angle, a mapping relation between the image information and the sound information at different viewing angles;
step S300: acquiring an input viewing angle, determining the image information and sound information corresponding to the input viewing angle according to the mapping relation, and reproducing the image information and sound information corresponding to the input viewing angle.
2. The method for fusion reproduction of images and sound according to claim 1, wherein the step S200 further comprises:
collecting image information and sound information of the target object at the same viewing angle, a mapping relation being formed between the image information and the sound information.
3. The method for fusion reproduction of images and sound according to claim 1, wherein the step S100 comprises:
calculating the maximum boundary radius of the target object at a viewing angle, and taking the edge formed by the maximum boundary radius as the viewing-angle boundary;
acquiring image information of the target object within the viewing-angle boundary at multiple viewing angles.
4. The method for fusion reproduction of images and sound according to claim 1 or 2, comprising:
acquiring input sound information, resolving the viewing angle corresponding to the input sound information, and reproducing the image information of that viewing angle;
acquiring input image information, resolving the viewing angle corresponding to the input image information, and reproducing the sound information of that viewing angle.
5. The method for fusion reproduction of images and sound according to claim 4, wherein the sound information includes at least one audio track:
when the sound information includes one audio track, reproducing the image information corresponding to that audio track;
when the sound information includes multiple audio tracks, acquiring an input audio track, resolving the image information corresponding to the input audio track, and reproducing it.
6. A system for fusion reproduction of images and sound, comprising:
an acquisition module: used for acquiring image information and sound information of a target object at multiple viewing angles;
a mapping module: used for establishing, on the basis of viewing angle, a mapping relation between the image information and the sound information at different viewing angles;
a reproduction module: used for acquiring an input viewing angle, determining the image information and sound information corresponding to the input viewing angle according to the mapping relation, and reproducing the image information and sound information corresponding to the input viewing angle.
7. The system for fusion reproduction of images and sound according to claim 6, wherein the mapping module is used for:
acquiring image information and sound information of the target object at the same viewing angle, the image information and the sound information forming a mapping relation.
8. The system for fusion reproduction of images and sound according to claim 6, wherein the acquisition module is further used for:
calculating the maximum boundary radius of the target object at a viewing angle, and taking the edge formed by the maximum boundary radius as the viewing-angle boundary;
acquiring image information of the target object within the viewing-angle boundary at multiple viewing angles.
9. An electronic device, comprising:
one or more processors;
a memory;
a screen for displaying the images and sound in the method according to any one of claims 1 to 5;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method according to any one of claims 1 to 5.
10. A computer-readable storage medium, characterized in that program code is stored in the computer-readable storage medium, the program code being callable by a processor to perform the method according to any one of claims 1 to 5.
CN202211043746.1A 2022-08-30 2022-08-30 Fusion reproduction method, system, equipment and storage medium of image and sound Pending CN115134581A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211043746.1A CN115134581A (en) 2022-08-30 2022-08-30 Fusion reproduction method, system, equipment and storage medium of image and sound

Publications (1)

Publication Number Publication Date
CN115134581A true CN115134581A (en) 2022-09-30

Family

ID=83387465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211043746.1A Pending CN115134581A (en) 2022-08-30 2022-08-30 Fusion reproduction method, system, equipment and storage medium of image and sound

Country Status (1)

Country Link
CN (1) CN115134581A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101681088A (en) * 2007-05-29 2010-03-24 汤姆森许可贸易公司 Method of creating and reproducing a panoramic sound image, and apparatus for reproducing such an image
CN105389318A (en) * 2014-09-09 2016-03-09 联想(北京)有限公司 Information processing method and electronic equipment
TW201734948A (en) * 2016-03-03 2017-10-01 森翠根科技有限公司 A method, system and device for generating associated audio and visual signals in a wide angle image system
CN112351248A (en) * 2020-10-20 2021-02-09 杭州海康威视数字技术股份有限公司 Processing method for associating image data and sound data
CN113853529A (en) * 2019-05-20 2021-12-28 诺基亚技术有限公司 Apparatus, and associated method, for spatial audio capture
CN114648614A (en) * 2022-05-24 2022-06-21 四川中绳矩阵技术发展有限公司 Three-dimensional reproduction method and system of target object

Similar Documents

Publication Publication Date Title
US10171769B2 (en) Sound source selection for aural interest
US20190139312A1 (en) An apparatus and associated methods
US10924875B2 (en) Augmented reality platform for navigable, immersive audio experience
US10278001B2 (en) Multiple listener cloud render with enhanced instant replay
US11109177B2 (en) Methods and systems for simulating acoustics of an extended reality world
JP2013093840A (en) Apparatus and method for generating stereoscopic data in portable terminal, and electronic device
TWI528794B (en) System and method for delivering media over network
US20190149941A1 (en) Particle-based spatial audio visualization
US20120317594A1 (en) Method and system for providing an improved audio experience for viewers of video
CN113256815B (en) Virtual reality scene fusion and playing method and virtual reality equipment
JP2001169309A (en) Information recording device and information reproducing device
CN106534968A (en) Method and system for playing 3D video in VR device
WO2019193244A1 (en) An apparatus, a method and a computer program for controlling playback of spatial audio
CN113676720A (en) Multimedia resource playing method and device, computer equipment and storage medium
CN112153472A (en) Method and device for generating special picture effect, storage medium and electronic equipment
US20220007078A1 (en) An apparatus and associated methods for presentation of comments
EP3503579B1 (en) Multi-camera device
CN115134581A (en) Fusion reproduction method, system, equipment and storage medium of image and sound
JP7054351B2 (en) System to play replay video of free viewpoint video
CA3044260A1 (en) Augmented reality platform for navigable, immersive audio experience
CN115225884A (en) Interactive reproduction method, system, device and medium for image and sound
EP3321795B1 (en) A method and associated apparatuses
KR20210056414A (en) System for controlling audio-enabled connected devices in mixed reality environments
CN106610982A (en) Media file generation method and apparatus
JP2019102940A (en) Virtual viewpoint content generation system, voice processing device, control method for virtual viewpoint content generation system, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination