CN110493640A - System and method for converting video into PPT based on video processing - Google Patents

System and method for converting video into PPT based on video processing

Info

Publication number
CN110493640A
CN110493640A (application CN201910706271.1A)
Authority
CN
China
Prior art keywords
processing
video
audio
picture
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910706271.1A
Other languages
Chinese (zh)
Inventor
敖欣 (Ao Xin)
朱泓谕 (Zhu Hongtou)
吴永满 (Wu Yongman)
黄鑫杰 (Huang Xinjie)
陈钿 (Chen Dian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongguan University of Technology
Original Assignee
Dongguan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongguan University of Technology filed Critical Dongguan University of Technology
Priority to CN201910706271.1A priority Critical patent/CN110493640A/en
Publication of CN110493640A publication Critical patent/CN110493640A/en
Pending legal-status Critical Current

Links

Classifications

    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/60Editing figures and text; Combining figures or text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440236Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a system and method for converting video into PPT based on video processing. The system comprises a data storage server, an application processing server and a WEB server, wherein the application processing server has an image processing module, an audio processing module and a document integration module. The data storage server separates the input into an audio stream and a video stream and transfers them to the image processing module and the audio processing module of the application processing server, respectively. The image processing module processes the large volume of image data; the audio processing module processes the audio and converts it into text. The image processing module and the audio processing module then output the picture and text data to the document integration module, which matches pictures with text and exports a complete PPT document to the WEB server.

Description

System and method for converting video into PPT based on video processing
Technical field
The present invention relates to the fields of information technology and educational technology, and in particular to a system and method for converting video into PPT based on video processing.
Background technique
With the growing value of knowledge in modern society, useful information is often fleeting: when a leader holds a meeting, a teacher gives a lecture, or a speaker gives a public talk, the speaker may speak quickly and listeners do not have time to note down so much information. The methods commonly used for this problem are to preserve the information through handwritten notes, by filming the explanation, and so on. Both of the above solutions bring problems. For handwritten notes: the notes are incomplete, so the useful information cannot be preserved accurately, and there is no way to reproduce the scene of the explanation when reviewing the notes. For filming the explanation: the video may be very long, from tens of minutes to several hours, which is no small challenge when reviewing the information, and the recorded speech may be indistinct and exhausting to listen to.
Patent application 201710179528.3 discloses a method and system for precisely matching video, PPT handouts and voice content. In that method, a camera records the teacher's video while screen recording software installed on the computer playing the PPT records the computer video; the teacher video and the computer video are merged and indexed by course name. The video is segmented into several segments according to image changes, segments whose on-screen text is identical are merged, and the time value of each segmentation point is recorded. The voice information in the teacher video is extracted and converted into text, and the time value of each sentence is recorded. With the course name and time values as indexes, data associations are established between video, images, voice and content. Applied in the field of educational technology, this allows a user playing teaching resources online to locate PPT handout content or classroom voice content of interest through a search service, to jump at any time to the instructional video of the relevant period, and to play the PPT handout content of the related pages.
However, the above method only makes a recording against an existing PPT; it cannot convert a video into a PPT. In summary, existing solutions all lead to a huge loss of information during transmission and increase the difficulty of understanding.
Summary of the invention
In view of the shortcomings of the above technology, the present invention provides a system for converting video into PPT based on video processing. Given a recorded video stream containing a PPT explanation, this system can compress the huge amount of video information into a complete set of PPT documents; users can quickly extract the parts they need from the mass of information, and the speaker's speech attached as notes can deepen the user's understanding of the subject and improve the efficiency of study and review.
A further object of the present invention is to provide a system for converting video into PPT based on video processing that is convenient to realize: only a device capable of shooting video is required and operation is simple, since the user only needs to pass the video to the system to obtain a complete PPT document, so the system is suitable for many occasions.
To achieve the above objects, the invention is realized as follows.
A system for converting video into PPT based on video processing, characterized in that the system comprises a data storage server, an application processing server and a WEB server, wherein the application processing server has an image processing module, an audio processing module and a document integration module. The data storage server separates the input into an audio stream and a video stream and transfers them to the image processing module and the audio processing module of the application processing server, respectively. The image processing module processes the large volume of image data; the audio processing module processes the audio and converts it into text. The image processing module and the audio processing module then output the picture and text data to the document integration module, which matches pictures with text and exports a complete PPT document to the WEB server.
A method for converting video into PPT based on video processing: the data storage server separates the input into an audio stream and a video stream, which are then transferred to the image processing module and the audio processing module of the application processing server, respectively. The image processing module processes the image data and converts the video into pictures; the audio processing module processes the audio and converts it into text. The image processing module and the audio processing module then output the picture and text data to the document integration module, which matches pictures with text and exports a complete PPT document to the WEB server.
Further, the method first uses the data storage server to receive the upload request sent by the client. The video stream is first passed to the data storage server for data backup; data processing is performed in the data storage server, which separates the audio stream from the video stream and segments the video into N frame pictures (N is a positive integer greater than or equal to 1). The image data and voice data are then passed to the application processing server, where image processing and speech processing are carried out.
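A minimal sketch of this stream-separation and frame-segmentation step is given below. It assumes ffmpeg and OpenCV are available on the data storage server; the file names (audio.wav, video_only.mp4) and the sampling step are illustrative placeholders, not values taken from the patent.

    import subprocess
    import cv2

    def split_streams(video_path, audio_out="audio.wav", video_out="video_only.mp4"):
        # Extract the audio stream as 16 kHz, 16-bit mono PCM for later speech processing.
        subprocess.run(["ffmpeg", "-y", "-i", video_path, "-vn", "-acodec", "pcm_s16le",
                        "-ar", "16000", "-ac", "1", audio_out], check=True)
        # Keep the video stream without audio.
        subprocess.run(["ffmpeg", "-y", "-i", video_path, "-an", "-c:v", "copy", video_out],
                       check=True)
        return audio_out, video_out

    def video_to_frames(video_path, step=30):
        # Segment the video into N frame pictures, keeping one frame every `step` frames.
        cap = cv2.VideoCapture(video_path)
        frames, idx = [], 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if idx % step == 0:
                frames.append(frame)
            idx += 1
        cap.release()
        return frames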
Further, in the application processing server: the image processing module is given a method for processing the large volume of image data. Considering that the PPT projection region may be tilted along the x, y and z axes, it needs to be rectified to a fronto-parallel position by a perspective transform. Specifically: each frame image is first converted to a grayscale image and filtered for noise reduction, then converted to a binary picture. The PPT projection region is a rectangular area and is a highlighted region compared to the surrounding environment, so edge detection is performed with the Canny algorithm and contours are extracted; the target contour is chosen by taking the contour with the largest area, the contour is then enclosed by polygon fitting, and the coordinates of the four rectangle corners are obtained by finding the convex hull of the contour. The four corner coordinates are then sorted into top-left, top-right, bottom-right and bottom-left, and finally a transformation matrix is used to transform the coordinates to obtain the final desired image, that is, the projected region, which is cropped out and used as the input to the image processing model.
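A minimal OpenCV sketch of this rectification step follows, under the assumption that the largest bright rectangular contour in the frame is the projection region; the blur kernel, Canny thresholds and output size are illustrative values, not taken from the patent.

    import cv2
    import numpy as np

    def rectify_slide(frame, out_size=(1280, 720)):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (5, 5), 0)                  # noise reduction
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        edges = cv2.Canny(binary, 50, 150)                        # Canny edge detection
        contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        target = max(contours, key=cv2.contourArea)               # largest-area contour
        hull = cv2.convexHull(target)
        approx = cv2.approxPolyDP(hull, 0.02 * cv2.arcLength(hull, True), True)
        if len(approx) != 4:
            return None                                           # no clean quadrilateral found
        pts = approx.reshape(4, 2).astype(np.float32)
        # Order the corners: top-left, top-right, bottom-right, bottom-left.
        s, d = pts.sum(axis=1), np.diff(pts, axis=1).ravel()
        ordered = np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                            pts[np.argmax(s)], pts[np.argmax(d)]], dtype=np.float32)
        w, h = out_size
        dst = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])
        M = cv2.getPerspectiveTransform(ordered, dst)
        return cv2.warpPerspective(frame, M, out_size)            # cropped, rectified slide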
Further, in the image processing model, Gaussian filtering and down-sampling (using a Gaussian pyramid) are applied first, then feature points are detected and feature vectors are extracted. For two pictures, the more similar their feature vectors are, the more similar the two pictures are; of a group of similar pictures only one is temporarily retained and the remaining similar pictures are stored in a set, which reduces the subsequent amount of computation. Before recognition, the pictures are finally restored by an up-sampling operation (using a Laplacian pyramid). A trained convolutional neural network model is then used to recognize the text and images in each remaining frame picture, the coordinates of the words and images within the picture are recorded, and finally the elements are recombined by their coordinates onto a new PPT page.
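The patent does not name a specific feature descriptor, so the sketch below illustrates the de-duplication idea with Gaussian-pyramid down-sampling and ORB feature matching; the pyramid depth and match threshold are illustrative assumptions.

    import cv2

    orb = cv2.ORB_create()
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    def downsample(img, levels=2):
        # Gaussian-pyramid down-sampling to reduce the cost of feature extraction.
        for _ in range(levels):
            img = cv2.pyrDown(img)
        return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    def similar(img_a, img_b, min_matches=40):
        # Detect feature points, extract descriptors and count cross-checked matches.
        _, des_a = orb.detectAndCompute(downsample(img_a), None)
        _, des_b = orb.detectAndCompute(downsample(img_b), None)
        if des_a is None or des_b is None:
            return False
        return len(matcher.match(des_a, des_b)) >= min_matches

    def deduplicate(frames):
        kept, duplicates = [], []
        for frame in frames:
            if any(similar(frame, k) for k in kept):
                duplicates.append(frame)    # stored in the set of similar pictures
            else:
                kept.append(frame)          # only one picture of each similar group is retained
        return kept, duplicates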
Further, the audio processing module is given a procedure for processing the audio, namely: VAD (voice activity detection) is performed first, a GMM model is used to classify speech and environmental noise, and the audio is denoised; the audio is then recognized by a hybrid algorithm based on an artificial neural network and converted into text.
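As one illustration of the VAD step, the sketch below segments the extracted 16 kHz mono audio into speech regions with the webrtcvad library; the patent does not specify a particular VAD implementation, and the frame length and aggressiveness level are illustrative.

    import wave
    import webrtcvad

    def speech_segments(wav_path, frame_ms=30, aggressiveness=2):
        vad = webrtcvad.Vad(aggressiveness)
        with wave.open(wav_path, "rb") as wf:
            rate = wf.getframerate()             # expects 16-bit mono PCM, e.g. 16000 Hz
            frame_bytes = int(rate * frame_ms / 1000) * 2
            pcm = wf.readframes(wf.getnframes())
        segments, start, t = [], None, 0.0
        for i in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
            if vad.is_speech(pcm[i:i + frame_bytes], rate):
                if start is None:
                    start = t                    # a speech region begins
            elif start is not None:
                segments.append((start, t))      # (start, end) in seconds
                start = None
            t += frame_ms / 1000
        if start is not None:
            segments.append((start, t))
        return segments

The resulting speech segments would then be passed to the neural-network-based recognizer and converted into timestamped text.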
Further, the document integration module is given an algorithm for matching voice and video, as follows: for each image set obtained in the image processing module, the time interval covered by that image set is calculated, and the text converted from the audio within that time interval is then looked up. In this way the speaker's speech can be matched accurately into the notes section of the corresponding PPT page, and a complete PPT document is exported to the WEB server.
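A minimal sketch of this matching and export step, assuming python-pptx for the output document; here `slides` is a list of (slide image path, time interval) pairs from the image processing module and `transcript` is a list of (start, end, text) triples from the audio module, all of which are illustrative names rather than identifiers from the patent.

    from pptx import Presentation
    from pptx.util import Inches

    def build_ppt(slides, transcript, out_path="output.pptx"):
        prs = Presentation()
        blank_layout = prs.slide_layouts[6]                    # blank slide layout
        for image_path, (t_start, t_end) in slides:
            slide = prs.slides.add_slide(blank_layout)
            slide.shapes.add_picture(image_path, Inches(0), Inches(0),
                                     width=prs.slide_width, height=prs.slide_height)
            # Gather the speech text whose time labels fall inside this slide's interval.
            notes = " ".join(text for start, end, text in transcript
                             if t_start <= start and end <= t_end)
            slide.notes_slide.notes_text_frame.text = notes    # speaker's speech as slide notes
        prs.save(out_path)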
Further, in the image processing module, a time label is added to each converted picture; likewise, a time label is attached to the text converted from the audio, so that the pictures can be matched with the text converted from the audio.
The video stream processing method realized by the present invention: on the image side, the projection information in the classroom or meeting can be captured effectively by target localization; on the premise of denoising the video, a continuous frame difference method is used to capture the frame images in which obvious frame changes occur, which avoids the motion blur, loss of detail and ghosting produced between frames and improves the image quality of the video; this greatly improves the conversion efficiency and clarity of the video.
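A simple sketch of the continuous frame difference idea: after light denoising, a frame is kept as a candidate slide image whenever its pixel-level difference from the previous frame exceeds a threshold (the threshold value below is illustrative).

    import cv2

    def keyframes_by_frame_difference(video_path, diff_threshold=12.0):
        cap = cv2.VideoCapture(video_path)
        keyframes, prev = [], None
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            gray = cv2.GaussianBlur(gray, (5, 5), 0)           # denoise before differencing
            if prev is None or cv2.absdiff(gray, prev).mean() > diff_threshold:
                keyframes.append(frame)                        # an obvious frame change occurred
            prev = gray
        cap.release()
        return keyframes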
Compared with the prior art, the beneficial effects of the present invention are as follows:
At present, information in meetings and classrooms is mostly recorded by handwritten notes or by directly recording video. These methods are not only inconvenient but also lead to the loss of precious information. The present invention proposes an effective solution: the user only needs to pass a video containing a PPT explanation to the system, and the system uses the processing of each technical module to extract the PPT from the video and attach to it the text converted from the speaker's speech.
In structure, products on the market mostly accept input from a mobile phone terminal, whereas the present invention supports multi-platform input: any platform that can connect to the Internet can be used. The core technology is also more diversified: it is divided into different logical processing servers, realizing multi-module processing and avoiding entanglement in the logic.
Detailed description of the invention
Fig. 1 is a structural schematic diagram of the system realized by the present invention.
Specific embodiment
In order to describe the present invention more clearly, it is further described below with reference to the accompanying drawings.
Please refer to Fig. 1, a structural schematic diagram of the system realized by the present invention. The system for converting video into PPT based on video processing realized by the present invention comprises a data storage server, an application processing server and a WEB server, wherein the application processing server has an image processing module, an audio processing module and a document integration module. The data storage server separates the input into an audio stream and a video stream and transfers them to the image processing module and the audio processing module of the application processing server, respectively. The image processing module processes the large volume of image data; the audio processing module processes the audio and converts it into text. The image processing module and the audio processing module then output the picture and text data to the document integration module, which matches pictures with text and exports a complete PPT document to the WEB server.
Accordingly, the method for converting video into PPT realized by the present invention is as follows:
The S1 server (the data storage server) receives the upload request sent by the client. The video stream is first passed to S1 for data backup, and data processing is performed in S1: the audio stream and video stream are separated, and the video is segmented into N frame pictures (N is a positive integer greater than or equal to 1). The image data and voice data are then passed to the S2 application processing server, where image processing and speech processing are carried out.
The processing in S2 is as follows. S2.1: the image processing module is given a method for processing the large volume of image data. Considering that the PPT projection region may be tilted along the x, y and z axes, it needs to be rectified to a fronto-parallel position by a perspective transform. Each frame image is first converted to a grayscale image and filtered for noise reduction, then converted to a binary picture. The PPT projection region is a rectangular area and is a highlighted region compared to the surrounding environment, so edge detection is performed with the Canny algorithm and contours are extracted; the target contour is chosen by taking the contour with the largest area, the contour is then enclosed by polygon fitting, and the coordinates of the four rectangle corners are obtained by finding the convex hull of the contour. The four corner coordinates are sorted into top-left, top-right, bottom-right and bottom-left, and finally a transformation matrix is used to transform the coordinates to obtain the final desired image, that is, the projected region, which is cropped out and used as the input to the image processing model.
In the image processing model, Gaussian filtering and down-sampling (using a Gaussian pyramid) are applied first, then feature points are detected and feature vectors are extracted; for two pictures, the more similar their feature vectors are, the more similar the two pictures are; of a group of similar pictures only one is temporarily retained and the remaining similar pictures are stored in a set, which reduces the subsequent amount of computation; before recognition, the pictures are restored by an up-sampling operation (using a Laplacian pyramid). A trained convolutional neural network model is then used to recognize the text and images in each remaining frame picture, the coordinates of the words and images within the picture are recorded, and finally the elements are recombined by their coordinates onto a new PPT page. S2.2: the audio processing module is given a procedure for processing the audio: VAD detection is performed first, a GMM model is used to classify speech and environmental noise, the audio is denoised, and the audio is then recognized by a hybrid algorithm based on an artificial neural network and converted into text. S2.3: the document integration module is given an algorithm for matching voice and video, as follows: for each image set obtained in S2.1, the time interval covered by that image set is calculated, and the text converted from the audio within that time interval is looked up; in this way the speaker's speech can be matched accurately into the notes section of the corresponding PPT page, and a complete PPT document is exported to the WEB server S3.
In short, the invention has the following advantages:
It provides a solution to the difficulty of taking notes and reviewing material in modern society: from a recorded video stream containing a PPT explanation, this system can compress the huge amount of video information into a complete set of PPT documents; users can quickly extract the parts they need from the mass of information, and the speaker's speech attached as notes can deepen the user's understanding of the subject and improve the efficiency of study and review.
The present invention is simple to operate: the video only needs to be passed to the system to be converted into a complete PPT document.
The present invention has strong applicability and is suitable for many occasions, with no restriction on location; only a device capable of shooting video is needed.
The present invention can preliminarily solve the problem of recognizing a single-page PPT template.
Disclosed above are only several specific embodiments of the present invention, but the present invention is not limited thereto; any variation conceivable by a person skilled in the art shall fall within the protection scope of the present invention.

Claims (8)

1. A system for converting video into PPT based on video processing, characterized in that the system comprises a data storage server, an application processing server and a WEB server, wherein the application processing server has an image processing module, an audio processing module and a document integration module; the data storage server separates the input into an audio stream and a video stream and transfers them to the image processing module and the audio processing module of the application processing server, respectively; the image processing module processes the large volume of image data, and the audio processing module processes the audio and converts it into text; the image processing module and the audio processing module then output the picture and text data to the document integration module, which matches pictures with text and exports a complete PPT document to the WEB server.
2. A method for converting video into PPT based on video processing, wherein the data storage server separates the input into an audio stream and a video stream, which are transferred to the image processing module and the audio processing module of the application processing server, respectively; the image processing module processes the image data and converts the video into pictures, and the audio processing module processes the audio and converts it into text; the image processing module and the audio processing module then output the picture and text data to the document integration module, which matches pictures with text and exports a complete PPT document to the WEB server.
3. The method according to claim 2, characterized in that the method first uses the data storage server to receive the upload request sent by the client; the video stream is first passed to the data storage server for data backup; data processing is performed in the data storage server, which separates the audio stream from the video stream and segments the video into N frame pictures, N being a positive integer greater than or equal to 1; the image data and voice data are then passed to the application processing server, where image processing and speech processing are carried out.
4. The method according to claim 3, characterized in that, in the application processing server: each frame image is first converted to a grayscale image and filtered for noise reduction, then converted to a binary picture; the PPT projection region is a rectangular area and is a highlighted region compared to the surrounding environment; edge detection is then performed with the Canny algorithm and contours are extracted; the target contour is chosen by taking the contour with the largest area; the contour is then enclosed by polygon fitting, and the coordinates of the four rectangle corners are obtained by finding the convex hull of the contour; the four corner coordinates are sorted into top-left, top-right, bottom-right and bottom-left; finally a transformation matrix is used to transform the coordinates to obtain the final desired image, that is, the projected region, which is cropped out and used as the input to the image processing model.
5. The method according to claim 4, characterized in that, in the image processing model, Gaussian filtering and down-sampling are applied first, then feature points are detected and feature vectors are extracted; for two pictures, the more similar their feature vectors are, the more similar the two pictures are; of a group of similar pictures only one is temporarily retained and the remaining similar pictures are stored in a set, reducing the subsequent amount of computation; before recognition, the pictures are restored using a Laplacian pyramid; a trained convolutional neural network model is then used to recognize the text and images in each remaining frame picture, the coordinates of the words and images within the picture are recorded, and finally the elements are recombined by their coordinates onto a new PPT page.
6. The method according to claim 2, characterized in that, in the audio processing module: VAD detection is performed first, a GMM model is used to classify speech and environmental noise, the audio is denoised, and the audio is then recognized by a hybrid algorithm based on an artificial neural network and converted into text.
7. The method according to claim 2, characterized in that, in the document integration module: for each image set obtained in the image processing module, the time interval covered by that image set is calculated, and the text converted from the audio within that time interval is then looked up; in this way the speaker's speech is matched accurately into the notes section of the corresponding PPT page, and a complete PPT document is exported to the WEB server.
8. The method according to claim 2, characterized in that, in the image processing module, a time label is added to each converted picture; likewise, a time label is attached to the text converted from the audio, so that the pictures can be matched with the text converted from the audio.
CN201910706271.1A 2019-08-01 2019-08-01 System and method for converting video into PPT based on video processing Pending CN110493640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910706271.1A CN110493640A (en) 2019-08-01 2019-08-01 System and method for converting video into PPT based on video processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910706271.1A CN110493640A (en) 2019-08-01 2019-08-01 System and method for converting video into PPT based on video processing

Publications (1)

Publication Number Publication Date
CN110493640A true CN110493640A (en) 2019-11-22

Family

ID=68549096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910706271.1A Pending CN110493640A (en) 2019-08-01 2019-08-01 System and method for converting video into PPT based on video processing

Country Status (1)

Country Link
CN (1) CN110493640A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111741359A (en) * 2020-05-28 2020-10-02 杨伟 Method and system for converting video into PPTX
CN112203036A (en) * 2020-09-14 2021-01-08 北京神州泰岳智能数据技术有限公司 Method and device for generating text document based on video content
WO2021114824A1 (en) * 2020-06-28 2021-06-17 平安科技(深圳)有限公司 Presentation generation method, apparatus, and device, and medium
CN113779345A (en) * 2021-09-06 2021-12-10 北京量子之歌科技有限公司 Teaching material generation method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030088546A (en) * 2002-05-11 2003-11-20 김대성 Multi-media system of making VOD (Video On Demand) using Word-processor(Powerpoint and etc.) with Voice and Music, which can be shown through internet
CN102799859A (en) * 2012-06-20 2012-11-28 北京交通大学 Method for identifying traffic sign
CN109309790A (en) * 2018-11-02 2019-02-05 长春市长光芯忆科技有限公司 A kind of meeting lantern slide intelligent recording method and system
CN109492206A (en) * 2018-10-10 2019-03-19 深圳市容会科技有限公司 PPT presentation file method for recording, device, computer equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030088546A (en) * 2002-05-11 2003-11-20 김대성 Multi-media system of making VOD (Video On Demand) using Word-processor(Powerpoint and etc.) with Voice and Music, which can be shown through internet
CN102799859A (en) * 2012-06-20 2012-11-28 北京交通大学 Method for identifying traffic sign
CN109492206A (en) * 2018-10-10 2019-03-19 深圳市容会科技有限公司 PPT presentation file method for recording, device, computer equipment and storage medium
CN109309790A (en) * 2018-11-02 2019-02-05 长春市长光芯忆科技有限公司 A kind of meeting lantern slide intelligent recording method and system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111741359A (en) * 2020-05-28 2020-10-02 杨伟 Method and system for converting video into PPTX
WO2021114824A1 (en) * 2020-06-28 2021-06-17 平安科技(深圳)有限公司 Presentation generation method, apparatus, and device, and medium
CN112203036A (en) * 2020-09-14 2021-01-08 北京神州泰岳智能数据技术有限公司 Method and device for generating text document based on video content
CN112203036B (en) * 2020-09-14 2023-05-26 北京神州泰岳智能数据技术有限公司 Method and device for generating text document based on video content
CN113779345A (en) * 2021-09-06 2021-12-10 北京量子之歌科技有限公司 Teaching material generation method and device, computer equipment and storage medium
CN113779345B (en) * 2021-09-06 2024-04-16 北京量子之歌科技有限公司 Teaching material generation method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110493640A (en) System and method for converting video into PPT based on video processing
CN101271525B (en) Fast image sequence characteristic remarkable picture capturing method
CN112465008B (en) Voice and visual relevance enhancement method based on self-supervision course learning
CN111539370A (en) Image pedestrian re-identification method and system based on multi-attention joint learning
CN109117777A (en) The method and apparatus for generating information
EP2246807A1 (en) Information processing apparatus and method, and program
CN112261477B (en) Video processing method and device, training method and storage medium
CN112183468A (en) Pedestrian re-identification method based on multi-attention combined multi-level features
CN111160533A (en) Neural network acceleration method based on cross-resolution knowledge distillation
CN109165573A (en) Method and apparatus for extracting video feature vector
CN108922559A (en) Recording terminal clustering method based on voice time-frequency conversion feature and integral linear programming
CN108921038A (en) A kind of classroom based on deep learning face recognition technology is quickly called the roll method of registering
CN109977832A (en) A kind of image processing method, device and storage medium
CN115564993A (en) Lip print image classification algorithm based on multi-scale feature fusion and attention mechanism
CN111950487A (en) Intelligent teaching analysis management system
CN108363771B (en) Image retrieval method for public security investigation application
CN114519880A (en) Active speaker identification method based on cross-modal self-supervision learning
CN116229319A (en) Multi-scale feature fusion class behavior detection method and system
CN112183450A (en) Multi-target tracking method
Cheng et al. The dku audio-visual wake word spotting system for the 2021 misp challenge
WO2022205329A1 (en) Object detection method, object detection apparatus, and object detection system
Huang et al. DS-UNet: A dual streams UNet for refined image forgery localization
CN114266952A (en) Real-time semantic segmentation method based on deep supervision
CN109522865A (en) A kind of characteristic weighing fusion face identification method based on deep neural network
CN110689066B (en) Training method combining face recognition data equalization and enhancement

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Ao Xin

Inventor after: Zhu Hongtou

Inventor after: Wu Yongman

Inventor after: Huang Xinjie

Inventor after: Chen Dian

Inventor after: Ye Yongfu

Inventor before: Ao Xin

Inventor before: Zhu Hongtou

Inventor before: Wu Yongman

Inventor before: Huang Xinjie

Inventor before: Chen Dian

CB03 Change of inventor or designer information
RJ01 Rejection of invention patent application after publication

Application publication date: 20191122

RJ01 Rejection of invention patent application after publication