CN107318054A - Audio-visual automated processing system and method - Google Patents
Audio-visual automated processing system and method
- Publication number
- CN107318054A CN107318054A CN201610266079.1A CN201610266079A CN107318054A CN 107318054 A CN107318054 A CN 107318054A CN 201610266079 A CN201610266079 A CN 201610266079A CN 107318054 A CN107318054 A CN 107318054A
- Authority
- CN
- China
- Prior art keywords
- video
- audio
- audio data
- target feature
- receiving
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8543—Content authoring using a description language, e.g. Multimedia and Hypermedia information coding Expert Group [MHEG], eXtensible Markup Language [XML]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
Abstract
The present invention provides an audio-visual automatic processing system that runs on a server connected to a transmitting terminal. The system includes: a setup module for determining each target feature to be detected and the effect corresponding to each target feature; a receiving module for receiving audio-visual data from the transmitting terminal; a detecting module for detecting each target feature in the audio-visual data when the data is received; and a processing module for, when a target feature is detected, obtaining the effect corresponding to that target feature and adding the effect to the audio-visual data. The present invention also provides an audio-visual automatic processing method. The invention enables automated processing of dynamically loaded audio-visual data.
Description
Technical field
The present invention relates to an audio-visual automatic processing system and an audio-visual automatic processing method.
Background technology
Existing audio-visual processing systems must first load an audio/video file in its entirety before processing it, then recognize user-defined static objects in the loaded file, and finally add user-defined static objects to the file based on what was recognized. Thus, every time an audio/video file is processed, the whole file must be loaded first, and both the recognized objects and the added objects are static.
The content of the invention
In view of the foregoing, it is necessary to provide an audio-visual automatic processing system and an audio-visual automatic processing method capable of automated processing of dynamically loaded audio-visual data.
An audio-visual automatic processing system runs on a server connected to a transmitting terminal. The system includes: a setup module for determining each target feature to be detected and the effect corresponding to each target feature; a receiving module for receiving audio-visual data from the transmitting terminal; a detecting module for detecting each target feature in the audio-visual data when the data is received; and a processing module for, when a target feature is detected, obtaining the effect corresponding to that target feature and adding the effect to the audio-visual data.
An audio-visual automatic processing method is applied to a server connected to a transmitting terminal. The method includes: a setting step of determining each target feature to be detected and the effect corresponding to each target feature; a receiving step of receiving audio-visual data from the transmitting terminal; a detecting step of detecting each target feature in the audio-visual data when the data is received; and a processing step of, when a target feature is detected, obtaining the effect corresponding to that target feature and adding the effect to the audio-visual data.
Compared with the prior art, the audio-visual automatic processing system of the present invention can detect configured target features in dynamically loaded audio-visual data and add the effect corresponding to each detected target feature to that data.
Brief description of the drawings
Fig. 1 is a schematic diagram of the running environment of an embodiment of the audio-visual automatic processing system of the present invention.
Fig. 2 is a functional block diagram of an embodiment of the audio-visual automatic processing system of the present invention.
Fig. 3 is a flowchart of an embodiment of the audio-visual automatic processing method of the present invention.
Main element symbol description
The following embodiments further illustrate the present invention with reference to the above drawings.
Embodiment
As shown in Fig. 1, which is a schematic diagram of the running environment of an embodiment of the audio-visual automatic processing system of the present invention, the audio-visual automatic processing system 10 is installed on the server 1. The server 1 is communicatively connected with a transmitting terminal 2 and at least one receiving terminal 3 (only one is drawn in the figure).
The server 1 includes, but is not limited to, a first communication device 11, a first storage device 12, and a first processor 13. The transmitting terminal 2 includes, but is not limited to, a second communication device 21, a second storage device 22, a second processor 23, and an input device 24. The receiving terminal 3 includes, but is not limited to, a third communication device 31, a third storage device 32, a third processor 33, and a playing device 34.
The server 1 communicates with the transmitting terminal 2 and the receiving terminal 3 through the first communication device 11, the second communication device 21, and the third communication device 31. The first communication device 11, the second communication device 21, and the third communication device 31 may each be a device capable of wireless communication, such as a wireless network card or a GPRS module, or a device capable of wired communication, such as a network interface card. In this embodiment, the server 1, the transmitting terminal 2, and the receiving terminal 3 each connect to the Internet through the first communication device 11, the second communication device 21, and the third communication device 31 respectively, and then communicate with one another via the Internet.
The first storage device 12, the second storage device 22, and the third storage device 32 respectively store the program instruction segments and data of the programs installed on the server 1, the transmitting terminal 2, and the receiving terminal 3. Each may be an internal storage device such as memory, or an external storage device such as a Smart Media Card, a Secure Digital Card, or a Flash Card. The first processor 13, the second processor 23, and the third processor 33 respectively execute the program instruction segments of the programs installed on the server 1, the transmitting terminal 2, and the receiving terminal 3, and control the corresponding devices to perform the corresponding operations.
The input device 24 receives input operations from the user of the transmitting terminal 2. The input operations include setting each target feature to be detected. The effect corresponding to each target feature may be a default, or may be set by the user of the transmitting terminal 2; that is, the input operations may further include setting the effect corresponding to each target feature. The input operations may also include receiving audio-visual data input by the user; the audio-visual data may include only audio, only video, or both audio and video. The input device 24 may be an input unit such as a touch screen or a keyboard, and may further include audio and video input devices such as a microphone and a camera.
A target feature may be a preset facial expression (such as a smiling face, a crying face, or a funny face), a preset action (such as raising a hand, falling down, or wiping away tears), a preset sound (such as laughter, applause, or a call for help), or a preset object (such as a cup, glasses, or a hat). The audio-visual automatic processing system 10 monitors for the above target features through preset programs, which may be one or more of a facial expression recognition program, a speech recognition program, and an object recognition program.
A corresponding effect may be playing a preset sound (such as laughter, cheering, or a pre-recorded sound), playing a preset picture, animation, or video, adding a specific effect (for example, adding sunglasses to a person's face, or adding a loudspeaker to a person's hand), or a combination of two or more of the above. The audio-visual automatic processing system 10 merges the above effects into the audio-visual data through corresponding programs, which may be video rendering programs corresponding to the different effects.
The playing device 34 plays the processed audio-visual data that is received. It may be an audio playing device such as a speaker or a loudspeaker, and may further include a video playing device such as a display screen.
The transmitting terminal 2 sends the audio-visual data to be processed, together with a list of the at least one receiving terminal 3 it should be sent to, to the server 1. The receiving terminal 3 receives the processed audio-visual data from the server 1 and plays it. The transmitting terminal 2 and the receiving terminal 3 may each be a mobile device such as a mobile phone, a tablet computer, or a wearable device, or a device such as a notebook computer or a personal computer. The server 1 receives from the transmitting terminal 2 the audio-visual data to be processed and the receiving terminals 3 it should be sent to, processes the received audio-visual data accordingly, and sends the processed audio-visual data to the specified receiving terminals 3. The server 1 may be a remote computer, a server, or other equipment.
It should be noted that, in certain embodiments, the transmitting terminal 2 may also simultaneously be a receiving terminal 3. That is, after sending the audio-visual data to be processed to the server 1, the transmitting terminal 2 also receives the processed audio-visual data back from the server 1. In that case, the transmitting terminal 2 and a receiving terminal 3 are arranged in the same device.
The audio-visual automatic processing system 10 receives from the transmitting terminal 2 the audio-visual data to be processed and the target features to be detected, detects the target features in the audio-visual data as soon as the data is received, and automatically adds the effect corresponding to each detected target feature to the audio-visual data.
As shown in Fig. 2, which is a functional block diagram of an embodiment of the audio-visual automatic processing system of the present invention, the audio-visual automatic processing system 10 can be divided into a setup module 101, a receiving module 102, a detecting module 103, and a processing module 104. A module, as referred to in the present invention, is a series of computer program segments that complete a specific function, and is better suited than a whole program for describing the execution process of the audio-visual automatic processing system 10. The specific function of each module is described below with reference to the flowchart of Fig. 3.
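The four-module division of Fig. 2 can be sketched as follows. This is a minimal illustrative sketch, not code from the patent: the class and method names are hypothetical, and audio-visual data and effects are reduced to plain strings so that the control flow between the modules is visible.

```python
class SetupModule:
    """Step S31: holds the mapping from target features to effects."""
    def __init__(self):
        self.feature_effects = {}

    def set_feature(self, feature, effect):
        self.feature_effects[feature] = effect


class DetectingModule:
    """Step S33: scans incoming data for any configured target feature."""
    def __init__(self, setup):
        self.setup = setup

    def detect(self, chunk):
        return [f for f in self.setup.feature_effects if f in chunk]


class ProcessingModule:
    """Step S34: attaches the corresponding effect for each detected feature."""
    def __init__(self, setup):
        self.setup = setup

    def apply(self, chunk, features):
        for f in features:
            chunk = f"{chunk}+[{self.setup.feature_effects[f]}]"
        return chunk


# Wire the modules together as the system 10 of Fig. 2 would.
setup = SetupModule()
setup.set_feature("smile", "overlay_sunglasses")
detector = DetectingModule(setup)
processor = ProcessingModule(setup)

found = detector.detect("frame:smile")
result = processor.apply("frame:smile", found)
```

The receiving module is omitted here; its streaming behaviour is covered under step S32.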
As shown in Fig. 3, which is a flowchart of an embodiment of the audio-visual automatic processing method of the present invention, the execution order of the steps in the flowchart may be changed and some steps may be omitted, according to different requirements.
Step S31: the setup module 101 determines each target feature to be detected and the effect corresponding to each target feature.
In this embodiment, the target features to be detected and the effect corresponding to each target feature are set by the user of the transmitting terminal 2. That is, the transmitting terminal 2 receives, through the input device 24, the target features to be detected and the corresponding effects set by the user, and then sends them to the server 1 through the second communication device 21. Specifically, the server 1 may send all the target features it can detect and all the effects it can realize to the transmitting terminal 2, so that the user of the transmitting terminal 2 can select each target feature to be detected and the effect corresponding to it. For example, a target feature may be set to the uttering of a certain phrase, with its effect set to playing a specified animation.
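The selection flow just described — the server advertises what it can detect and realize, and the user picks a mapping — can be sketched as follows. The feature and effect identifiers are invented for illustration.

```python
# Features the server can detect and effects it can realize (hypothetical).
SERVER_FEATURES = {"smile", "cry", "raise_hand", "phrase:Hanabi"}
SERVER_EFFECTS = {"play_laughter", "play_animation", "fireworks_overlay"}

def choose_mapping(selection):
    """Validate a user-chosen feature -> effect mapping against what the
    server advertised, as step S31 requires."""
    for feature, effect in selection.items():
        if feature not in SERVER_FEATURES:
            raise ValueError(f"server cannot detect: {feature}")
        if effect not in SERVER_EFFECTS:
            raise ValueError(f"server cannot realize: {effect}")
    return selection

# The example from the text: uttering a phrase triggers a specified animation.
mapping = choose_mapping({"phrase:Hanabi": "fireworks_overlay"})
```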
In another embodiment, the target features to be detected are set by the user of the transmitting terminal 2, while the effect corresponding to each target feature is a default effect. That is, the transmitting terminal 2 receives, through the input device 24, the target features to be detected as set by the user, and then sends those target features to the server 1 through the second communication device 21. Specifically, the server 1 may send all the target features it can detect, together with the effect corresponding to each, to the transmitting terminal 2, so that the user of the transmitting terminal 2 can select each target feature to be detected.
In yet another embodiment, both the target features to be detected and the effect corresponding to each target feature are defaults; that is, each target feature to be detected and its corresponding effect have already been set.
Step S32: the receiving module 102 receives from the transmitting terminal 2 the audio-visual data to be processed and the one or more receiving terminals 3 it should be sent to. The audio-visual data may be an audio file (such as a recording), a video file (such as a recorded video), an audio stream (such as a live phone call), or a video stream (such as a video being recorded).
It should be noted that, in this embodiment, the transmitting terminal 2 sends the audio-visual data in the form of a file stream. As soon as the receiving module 102 receives the file stream sent by the transmitting terminal 2, step S33 is executed immediately, while the receiving module 102 continues receiving the audio-visual data, rather than waiting until the audio-visual data has been fully received before executing step S33.
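The streaming behaviour described above — detection begins on each chunk as it arrives, while later chunks are still in transit — can be sketched with a generator. Chunks are plain strings here for illustration; a real receiving module would consume a network file stream.

```python
def receive_stream(chunks):
    """Stand-in for the receiving module 102: yields file-stream chunks."""
    for chunk in chunks:
        yield chunk

def process_stream(chunks, targets):
    """Detect target features chunk by chunk (step S33); each hit would
    trigger step S34 immediately, before the stream has finished."""
    hits = []
    for i, chunk in enumerate(receive_stream(chunks)):
        for target in targets:
            if target in chunk:
                hits.append((i, target))
    return hits

hits = process_stream(["hello", "a smile here", "bye", "smile again"], ["smile"])
```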
Step S33: when the audio-visual data is received, the detecting module 103 immediately detects each target feature in it. As soon as it receives the audio-visual data, the detecting module 103 uses the preset programs to detect whether the audio-visual data contains any of the target features set by the user of the transmitting terminal 2, and which ones it contains. When a target feature is detected, step S34 is executed immediately, while the detecting module 103 continues checking whether the received audio-visual data contains other target features. The preset programs may be one or more of a facial expression recognition program, a speech recognition program, and an object recognition program.
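The dispatch among the three recognizer types named above can be sketched as follows. The recognizers here are trivial keyword matchers standing in for real facial-expression, speech, and object recognition programs.

```python
def face_recognizer(chunk):
    return [t for t in ("smile", "cry", "funny_face") if t in chunk]

def speech_recognizer(chunk):
    return [t for t in ("laugh", "applause", "Hanabi") if t in chunk]

def object_recognizer(chunk):
    return [t for t in ("cup", "glasses", "hat") if t in chunk]

# One or more preset recognition programs, as in the embodiment.
RECOGNIZERS = (face_recognizer, speech_recognizer, object_recognizer)

def detect_features(chunk):
    """Run every configured recognizer on a chunk and pool the detections."""
    found = []
    for recognize in RECOGNIZERS:
        found.extend(recognize(chunk))
    return found

features = detect_features("frame with a smile and a cup")
```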
Step S34: when a target feature is detected in the audio-visual data, the processing module 104 obtains the effect corresponding to that target feature and adds the effect to the audio-visual data. The corresponding effect may be playing a preset sound (such as laughter, cheering, or a pre-recorded sound), playing a preset picture, animation, or video, adding a specific effect (such as adding sunglasses to a person's face, or adding a loudspeaker to a person's hand), or a combination of two or more of the above. The processing module 104 merges the effect corresponding to the target feature into the audio-visual data through a corresponding program, which may be a video rendering program for that effect.
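Step S34 can be sketched as a lookup-and-merge: each detected feature is looked up in the effect table and its effect is attached to the outgoing data. The "rendering" here is just an annotation on a frame record; a real system would hand off to a video rendering program.

```python
# Hypothetical feature -> effect table configured in step S31.
EFFECTS = {
    "smile": "add_sunglasses",
    "Hanabi": "fireworks_overlay",
}

def apply_effects(frame, detected):
    """Return a copy of the frame with every matching effect attached,
    leaving the original frame untouched."""
    out = dict(frame)
    out["effects"] = [EFFECTS[f] for f in detected if f in EFFECTS]
    return out

frame = {"index": 7, "payload": "..."}
rendered = apply_effects(frame, ["smile"])
```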
Step S35: the processing module 104 sends the processed audio-visual data to the one or more receiving terminals 3 it should be sent to. When a receiving terminal 3 receives the processed audio-visual data, it plays the received data through the playing device 34.
For example, suppose user A has set, in the system, the sound "Hanabi" to correspond to the effect "fireworks sound-and-light animation on the picture". During a video call between user A and user B, user A invites user B to go watch fireworks together. To spark user B's interest, user A can say "Hanabi". The audio-visual automatic processing system 10 detects the sound "Hanabi" uttered by user A (the target feature) and, based on the detected sound, adds a fireworks sound-and-light animation (the corresponding effect) to the current picture.
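The "Hanabi" scenario condenses end to end into a configured trigger sound in the call audio selecting an overlay for the current picture. The mapping below is illustrative only.

```python
# Trigger sound -> effect, as user A configured it in the example.
TRIGGERS = {"Hanabi": "fireworks sound-and-light animation"}

def handle_call_audio(utterances):
    """Return the overlay to apply after each utterance (None if none)."""
    return [TRIGGERS.get(u) for u in utterances]

overlays = handle_call_audio(["hello", "Hanabi", "bye"])
```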
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that the technical solution of the present invention may be modified or equivalently substituted without departing from the spirit and scope of the technical solution of the present invention.
Claims (12)
1. An audio-visual automatic processing system running on a server connected to a transmitting terminal, wherein the system comprises:
a setup module for determining each target feature to be detected and the effect corresponding to each target feature;
a receiving module for receiving audio-visual data from the transmitting terminal;
a detecting module for detecting each target feature in the audio-visual data when the audio-visual data is received; and
a processing module for, when a target feature is detected, obtaining the effect corresponding to the target feature and adding the effect to the audio-visual data.
2. The audio-visual automatic processing system as claimed in claim 1, wherein the detecting module detects each target feature in the audio-visual data immediately upon receiving the audio-visual data.
3. The audio-visual automatic processing system as claimed in claim 1, wherein the detecting module detects each target feature in the audio-visual data after the audio-visual data has been received in full.
4. The audio-visual automatic processing system as claimed in claim 1, wherein the receiving module further receives from the transmitting terminal one or more receiving terminals, connected to the server, to which the data is to be sent; and the processing module further sends the processed audio-visual data to the one or more receiving terminals.
5. The audio-visual automatic processing system as claimed in any one of claims 1 to 4, wherein the target feature is at least one of a preset facial expression, a preset action, a preset sound, and a preset object.
6. The audio-visual automatic processing system as claimed in any one of claims 1 to 4, wherein the corresponding effect is one or more of playing a preset sound, playing a preset picture, animation, or video, and adding a specific effect.
7. An audio-visual automatic processing method applied to a server connected to a transmitting terminal, wherein the method comprises:
a setting step of determining each target feature to be detected and the effect corresponding to each target feature;
a receiving step of receiving audio-visual data from the transmitting terminal;
a detecting step of detecting each target feature in the audio-visual data when the audio-visual data is received; and
a processing step of, when a target feature is detected, obtaining the effect corresponding to the target feature and adding the effect to the audio-visual data.
8. The audio-visual automatic processing method as claimed in claim 7, wherein the detecting step detects each target feature in the audio-visual data immediately upon receiving the audio-visual data.
9. The audio-visual automatic processing method as claimed in claim 7, wherein the detecting step detects each target feature in the audio-visual data after the audio-visual data has been received in full.
10. The audio-visual automatic processing method as claimed in claim 7, wherein the receiving step further receives from the transmitting terminal one or more receiving terminals, connected to the server, to which the data is to be sent; and the processing step further sends the processed audio-visual data to the one or more receiving terminals.
11. The audio-visual automatic processing method as claimed in any one of claims 7 to 10, wherein the target feature is at least one of a preset facial expression, a preset action, a preset sound, and a preset object.
12. The audio-visual automatic processing method as claimed in any one of claims 7 to 10, wherein the corresponding effect is one or more of playing a preset sound, playing a preset picture, animation, or video, and adding a specific effect.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610266079.1A CN107318054A (en) | 2016-04-26 | 2016-04-26 | Audio-visual automated processing system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107318054A true CN107318054A (en) | 2017-11-03 |
Family
ID=60184462
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610266079.1A Pending CN107318054A (en) | 2016-04-26 | 2016-04-26 | Audio-visual automated processing system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107318054A (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1336288A (en) * | 2000-07-31 | 2002-02-20 | 赵晓峰 | Direct image printing method on metal foil |
CN1532775A (en) * | 2003-03-19 | 2004-09-29 | Matsushita Electric Industrial Co., Ltd. | Visuable telephone terminal |
CN102455898A (en) * | 2010-10-29 | 2012-05-16 | 张明 | Cartoon expression based auxiliary entertainment system for video chatting |
CN104252226A (en) * | 2013-06-28 | 2014-12-31 | 联想(北京)有限公司 | Information processing method and electronic equipment |
CN104703043A (en) * | 2015-03-26 | 2015-06-10 | 努比亚技术有限公司 | Video special effect adding method and device |
CN104780339A (en) * | 2015-04-16 | 2015-07-15 | 美国掌赢信息科技有限公司 | Method and electronic equipment for loading expression effect animation in instant video |
CN104780459A (en) * | 2015-04-16 | 2015-07-15 | 美国掌赢信息科技有限公司 | Method and electronic equipment for loading effects in instant video |
CN104780458A (en) * | 2015-04-16 | 2015-07-15 | 美国掌赢信息科技有限公司 | Method and electronic equipment for loading effects in instant video |
CN104917994A (en) * | 2015-06-02 | 2015-09-16 | 烽火通信科技股份有限公司 | Audio and video calling system and method |
CN105049911A (en) * | 2015-07-10 | 2015-11-11 | 西安理工大学 | Video special effect processing method based on face identification |
US20160086368A1 (en) * | 2013-03-27 | 2016-03-24 | Nokia Technologies Oy | Image Point of Interest Analyser with Animation Generator |
CN105468142A (en) * | 2015-11-16 | 2016-04-06 | 上海璟世数字科技有限公司 | Interaction method and system based on augmented reality technique, and terminal |
- 2016
- 2016-04-26: CN application CN201610266079.1A filed; published as CN107318054A (status: Pending)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105100366B (en) | Harassing call number determines methods, devices and systems | |
CN109857352A (en) | Cartoon display method and human-computer interaction device | |
CN108805091A (en) | Method and apparatus for generating model | |
CN109919244B (en) | Method and apparatus for generating a scene recognition model | |
CN109086719A (en) | Method and apparatus for output data | |
US20140006550A1 (en) | System for adaptive delivery of context-based media | |
CN109977839A (en) | Information processing method and device | |
CN108924381B (en) | Image processing method, image processing apparatus, and computer readable medium | |
CN109871834A (en) | Information processing method and device | |
CN108509611A (en) | Method and apparatus for pushed information | |
CN112420069A (en) | Voice processing method, device, machine readable medium and equipment | |
CN109934191A (en) | Information processing method and device | |
CN110827824B (en) | Voice processing method, device, storage medium and electronic equipment | |
CN110502665A (en) | Method for processing video frequency and device | |
CN113033677A (en) | Video classification method and device, electronic equipment and storage medium | |
CN112381074B (en) | Image recognition method and device, electronic equipment and computer readable medium | |
CN109949793A (en) | Method and apparatus for output information | |
CN107195314B (en) | The method for recording and device of audio data | |
CN113220752A (en) | Display method and device and electronic equipment | |
US20150112997A1 (en) | Method for content control and electronic device thereof | |
CN110619602B (en) | Image generation method and device, electronic equipment and storage medium | |
CN113628097A (en) | Image special effect configuration method, image recognition method, image special effect configuration device and electronic equipment | |
CN107944024B (en) | Method and device for determining audio file | |
CN108062405B (en) | Picture classification method and device, storage medium and electronic equipment | |
CN107318054A (en) | Audio-visual automated processing system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20171103 |