CN111464862A - Video screenshot method based on voice recognition and image processing - Google Patents
- Publication number
- CN111464862A (application number CN202010330355.2A)
- Authority
- CN
- China
- Prior art keywords
- video
- screenshot
- target
- face image
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4334—Recording operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/44—Event detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The invention relates to a video screenshot method based on voice recognition and image processing. The method comprises: receiving a video screenshot voice instruction and performing voice recognition on it to obtain screenshot instruction text data; if the screenshot instruction text data is valid text data, playing a video file and extracting each video image frame; extracting the face images in the video image frames; determining, according to a preset target face image database, the video image frames that contain a target face image; capturing the video file according to the determined video image frames to obtain target screenshots; establishing a target screenshot database from the target screenshots; and finally outputting the target screenshot database. The method requires no manual screenshot by operators, which reduces personnel cost and operator workload. Because the face images are compared automatically, images cannot be missed through human error, and screenshot accuracy is greatly improved.
Description
Technical Field
The invention relates to a video screenshot method based on voice recognition and image processing.
Background
At present, video processing technology is applied ever more widely. In many cases a video file must be processed to obtain the relevant data it contains, and images carrying that information must be captured from the file for subsequent use. The conventional video screenshot approach is manual: an operator watches the video file and takes a screenshot whenever a frame contains the relevant information. This requires the operator to sit at a computer dedicated to watching the file and to stay highly concentrated, which demands considerable effort; frames are easily missed through negligence, so the screenshot accuracy is low.
Disclosure of Invention
The invention aims to provide a video screenshot method based on voice recognition and image processing, in order to solve the problems of the manual screenshot approach: it demands considerable operator effort, images are easily missed through negligence, and the screenshot accuracy is low.
In order to solve the problems, the invention adopts the following technical scheme:
a video screenshot method based on voice recognition and image processing comprises the following steps:
receiving a video screenshot voice instruction;
carrying out voice recognition on the video screenshot voice instruction to obtain screenshot instruction text data;
inputting the screenshot instruction text data into a preset video screenshot instruction special dictionary for comparison, and if at least one word in the video screenshot instruction special dictionary exists in the screenshot instruction text data, judging the screenshot instruction text data to be valid text data;
converting the valid text data into a video screenshot control instruction;
playing a preset video file according to the video screenshot control instruction;
extracting each video image frame of the video file in the video file playing process;
extracting a face image contained in each video image frame according to each video image frame;
inputting the face image contained in each video image frame into a preset target face image database for comparison, determining the video image frame containing at least one target face image in the target face image database, and obtaining a target video image frame; wherein the target face image database comprises at least one target face image;
capturing the video file according to the obtained target video image frame to obtain a target screenshot corresponding to the target video image frame;
establishing a target screenshot database according to the obtained target screenshot;
and outputting the target screenshot database.
Preferably, the inputting the face image included in each video image frame into a preset target face image database for comparison includes:
for any face image in any video image frame, marking feature coordinates of each key feature in the face image and each target face image in the target face image database based on a preset face key feature list;
calculating a feature distance value between the feature coordinate of each key feature of the face image and the feature coordinate of each key feature in the target face image for any target face image in the target face image database; calculating to obtain a target average value of the characteristic distance values; obtaining the matching degree corresponding to the target average value according to the corresponding relation between the preset average value and the matching degree; the preset corresponding relation between the average value and the matching degree comprises at least two average value intervals and the matching degree corresponding to each average value interval, and the average value intervals and the matching degree are in an anti-correlation relation;
obtaining the matching degree of the face image corresponding to each target face image in the target face image database;
and if at least one matching degree which is greater than or equal to a preset matching degree threshold value exists, the arbitrary video image frame is the target video image frame.
Preferably, the step of inputting the screenshot instruction text data into a preset video screenshot instruction special dictionary for comparison includes:
and comparing each word in the video screenshot instruction special dictionary with the screenshot instruction text data to obtain whether the word in the video screenshot instruction special dictionary exists in the screenshot instruction text data or not.
Preferably, the words in the video screenshot instruction special dictionary include "screenshot".
Preferably, the words in the video screenshot instruction special dictionary further include words associated with "screenshot".
The invention has the following beneficial effects. When a video file needs to be captured, the operator speaks a video screenshot voice instruction, which is voice-recognized into screenshot instruction text data. The text data is then checked against a preset video screenshot instruction special dictionary; if at least one word from the dictionary appears in it, the text data is judged to be valid text data. The valid text data is converted into a video screenshot control instruction, and a preset video file is played according to that instruction. Because playback and screenshot are started through voice recognition rather than by clicking the video file and capturing frames by hand, the degree of intelligence is greatly improved, no manual operation is needed, and control convenience is improved. During playback, each video image frame of the video file is extracted, the face images contained in each frame are extracted, and those face images are compared against a preset target face image database. The video image frames containing at least one target face image from the database are the required target video image frames. The video file is captured according to these target frames to obtain the corresponding target screenshots, a target screenshot database containing every target screenshot is established, and finally the database is output.
Therefore, the video screenshot method provided by the invention is an automatic screenshot method, automatic screenshot is carried out according to the comparison result of the face images to obtain the required screenshot, manual screenshot of operators is not needed, the personnel input cost is reduced, the labor intensity of the operators is reduced, and moreover, the situation that partial images are omitted due to human factors is avoided due to automatic comparison of the face images, so that the screenshot accuracy is greatly improved.
Drawings
In order to illustrate the technical solution of the embodiment of the present invention more clearly, the drawings used in the embodiment are briefly described as follows:
Fig. 1 is a flow diagram of the video screenshot method based on voice recognition and image processing.
Detailed Description
The embodiment provides a video screenshot method based on voice recognition and image processing; the execution main body of the method can be a desktop computer, a notebook computer, an intelligent mobile terminal, or the like. Because a voice signal must be acquired, the execution main body needs a voice acquisition device such as a microphone, for example the built-in microphone of a notebook computer or intelligent mobile terminal. Because video playback must be controlled, the execution main body is provided with a video playing application, such as one of the current mainstream video players; if several are installed, one of them is designated as the default player for the video file, and that application is the one started in the subsequent control.
As shown in Fig. 1, the video screenshot method includes the following steps:
receiving a video screenshot voice instruction:
the execution main body stores a preset video file, namely a video file needing screenshot. When the video file needs to be subjected to screenshot, an operator speaks a video screenshot voice instruction. And the microphone of the execution main body or the microphone provided by the execution main body acquires the video screenshot voice instruction of the operator.
Carrying out voice recognition on the video screenshot voice instruction to obtain screenshot instruction text data:
the execution main body is internally provided with the existing voice recognition algorithm, and the acquired video screenshot voice instruction is subjected to voice recognition according to the voice recognition algorithm to obtain screenshot instruction text data.
Inputting the screenshot instruction text data into a preset video screenshot instruction special dictionary for comparison, and if at least one word in the video screenshot instruction special dictionary exists in the screenshot instruction text data, judging that the screenshot instruction text data is valid text data:
the execution main body is preset with a special dictionary for the video screenshot instruction, the special dictionary for the video screenshot instruction comprises at least one word, each word in the special dictionary for the video screenshot instruction is a related word of a control instruction of a video screenshot, and as a specific implementation mode, the words in the special dictionary for the video screenshot instruction comprise a screenshot, and further comprise words related to the screenshot, such as the screenshot, the screen capture and the like.
Inputting the screenshot instruction text data into the video screenshot instruction special dictionary for comparison means comparing each word in the dictionary with the screenshot instruction text data: for any word in the dictionary, it is judged whether that word occurs in the text data. This finally establishes whether any word from the dictionary exists in the screenshot instruction text data.
If at least one word from the video screenshot instruction special dictionary exists in the screenshot instruction text data, the text data is judged to be valid text data. For example, if the recognized text is "video screenshot", the dictionary word "screenshot" occurs in it, so the screenshot instruction text data is judged valid.
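The comparison just described reduces to a substring check of each dictionary word against the recognized text. A minimal sketch follows; the function name and the word list are illustrative assumptions, not taken from the patent:

```python
# Hedged sketch of the dictionary comparison: the recognized text is
# judged valid if any dictionary word occurs in it as a substring.
# The word list is illustrative, not specified by the patent.

SCREENSHOT_DICTIONARY = ("screenshot", "screen capture")

def is_valid_instruction(text, dictionary=SCREENSHOT_DICTIONARY):
    """Return True if at least one dictionary word appears in the text."""
    text = text.lower()
    return any(word in text for word in dictionary)

print(is_valid_instruction("video screenshot"))    # True
print(is_valid_instruction("turn up the volume"))  # False
```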
Converting the valid text data into a video screenshot control instruction:
The obtained valid text data is converted into a video screenshot control instruction; as a specific implementation, the video screenshot control instruction can be a specific data string.
And playing a preset video file according to the video screenshot control instruction:
and controlling to start the installed or default video playing application according to the obtained video screenshot control instruction, and playing a preset video file after the video playing application is started.
In the video file playing process, extracting each video image frame of the video file:
in the process of playing the video file, each video image frame included in the video file is read, and each video playing frame is sequentially output at a preset video playing frame rate based on the frame number of each video image frame, for example, the video playing frame rate may be 60dps, that is, 60 video image frames are output per second.
According to each video image frame, extracting a face image contained in each video image frame:
the execution main body is internally provided with the existing face recognition algorithm, the face recognition algorithm can analyze and process each video image frame, and the face image contained in each video image frame is extracted and obtained. It should be understood that, for a certain video image frame, there may be only one person or a plurality of persons in the video image frame, and therefore, for any one video image frame, there may be only one face image or a plurality of (i.e., at least two) face images.
Inputting the face image contained in each video image frame into a preset target face image database for comparison, determining the video image frame containing at least one target face image in the target face image database, and obtaining a target video image frame; wherein the target face image database comprises at least one target face image:
a target face image database is preset in the execution main body, the target face image database comprises at least one target face image, and the specific setting number is set according to actual needs. Each target face image is a screenshot standard of a video screenshot, and for a certain video image frame, if the face image contained in the video image frame has a target face image in a target face image database, that is, at least one face image in the video image frame is a target face image in a target face image database, the video image frame is a required video image frame, and screenshot needs to be performed according to the video image frame.
The determination process of whether the face image of each video image frame has the target face image in the target face image database is the same, so that the following description will be given by taking any one of the video image frames as an example, and the determination process of other video image frames is the same.
A video image frame may contain one face image or at least two; the following takes a frame containing a single face image as an example. If a frame contains at least two face images, each is processed in the same way, one pass per face image. For the face image in the frame, each key feature of the face image and of every target face image in the target face image database is marked based on a preset face key feature list, yielding the feature coordinates of each key feature within its image. The face key feature list may include the four features eyes, ears, mouth and nose, and may further include eyebrows, forehead and so on; the features actually included are chosen according to actual needs.
For any target face image in the target face image database, a feature distance value is calculated between the feature coordinate of each key feature of the face image and the feature coordinate of the corresponding key feature of the target face image; the distance between two feature coordinates can be computed with a coordinate distance formula such as the Euclidean distance. The average of these feature distance values is then calculated; this average is the target average value. A correspondence between average values and matching degrees is preset: it contains at least two average value intervals and the matching degree for each interval, with intervals and matching degrees inversely correlated. A lower average value interval means the key features of the two images are closer together and the images are more similar, so the corresponding matching degree is higher. For example, the correspondence may contain two intervals [x1, x2] and (x2, x3], with [x1, x2] corresponding to matching degree y1 and (x2, x3] corresponding to y2, where x1 < x2 < x3 and y1 > y2; the specific interval boundaries and matching degree values are set according to actual requirements.
The above process yields the matching degree between the face image and one target face image; the matching degree against each other target face image is obtained in the same way. Thus the matching degree between the face image and every target face image in the target face image database is obtained; if the database contains N target face images, N matching degrees result.
If at least one of the N matching degrees is greater than or equal to a preset matching degree threshold, the face image is highly similar to at least one target face image in the target face image database, and the video image frame is judged to be a target video image frame.
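The distance-averaging and interval-mapping steps above can be sketched as follows. The coordinates, interval boundaries, and threshold are all illustrative values chosen for the example, not taken from the patent:

```python
import math

# Sketch of the matching-degree computation: mean Euclidean distance
# over key-feature coordinates, mapped through preset average-value
# intervals to a matching degree (lower distance -> higher degree).

def matching_degree(face_coords, target_coords, intervals):
    """face_coords / target_coords: dicts of feature name -> (x, y).

    intervals: list of (upper_bound, degree) pairs sorted by ascending
    bound; the mean distance falls into the first interval it fits.
    """
    distances = [math.dist(face_coords[k], target_coords[k]) for k in face_coords]
    mean = sum(distances) / len(distances)
    for upper, degree in intervals:
        if mean <= upper:
            return degree
    return 0.0  # mean distance beyond all intervals: no meaningful match

def is_target_frame(degrees, threshold):
    """A frame is a target frame if any matching degree reaches the threshold."""
    return any(d >= threshold for d in degrees)

face = {"eye": (10.0, 10.0), "nose": (12.0, 15.0)}
target = {"eye": (10.0, 11.0), "nose": (12.0, 16.0)}
deg = matching_degree(face, target, [(2.0, 0.9), (5.0, 0.5)])
print(deg)                          # 0.9 (mean distance 1.0 falls in the first interval)
print(is_target_frame([deg], 0.8))  # True
```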
The above is the judgment process for one video image frame; every other video image frame is judged in the same way, so that finally all video image frames containing at least one target face image from the target face image database are determined. These are the target video image frames.
The above describes one specific face image comparison process; it should be understood that the application is not limited to it, and other existing face image comparison processes may be adopted in other embodiments.
Capturing the video file according to the obtained target video image frame to obtain a target screenshot corresponding to the target video image frame:
and obtaining target video image frames according to the process, and then, capturing the video file according to the obtained target video image frames to obtain target screenshots corresponding to the target video image frames. For a certain target video image frame, the progress of the target video image frame in a video file can be determined according to the target video image frame, and then the video file is subjected to screenshot according to the progress to obtain a target screenshot corresponding to the target video image frame. As the video screenshot process belongs to the conventional technical means, the description is not repeated.
Establishing a target screenshot database according to the obtained target screenshot:
establishing a target screenshot database according to the obtained target screenshot, and giving a specific implementation process as follows: firstly, establishing a blank initial screenshot database, then adding each obtained target screenshot into the initial screenshot database, and finally updating the initial screenshot database to obtain a target screenshot database.
Outputting the target screenshot database:
The established target screenshot database is output, for example by wired or wireless transmission, to external equipment, so that the external equipment or relevant personnel can perform subsequent processing on it.
The above-mentioned embodiments merely illustrate the technical solution of the present invention in one specific implementation; any equivalent substitution or partial modification that does not depart from the spirit and scope of the invention shall be covered by the claims of the present invention.
Claims (5)
1. A video screenshot method based on voice recognition and image processing is characterized by comprising the following steps:
receiving a video screenshot voice instruction;
carrying out voice recognition on the video screenshot voice instruction to obtain screenshot instruction text data;
inputting the screenshot instruction text data into a preset video screenshot instruction special dictionary for comparison, and if at least one word in the video screenshot instruction special dictionary exists in the screenshot instruction text data, judging the screenshot instruction text data to be valid text data;
converting the valid text data into a video screenshot control instruction;
playing a preset video file according to the video screenshot control instruction;
extracting each video image frame of the video file in the video file playing process;
extracting a face image contained in each video image frame according to each video image frame;
inputting the face image contained in each video image frame into a preset target face image database for comparison, determining the video image frame containing at least one target face image in the target face image database, and obtaining a target video image frame; wherein the target face image database comprises at least one target face image;
capturing the video file according to the obtained target video image frame to obtain a target screenshot corresponding to the target video image frame;
establishing a target screenshot database according to the obtained target screenshot;
and outputting the target screenshot database.
2. The video screenshot method based on voice recognition and image processing as claimed in claim 1, wherein the step of inputting the face image contained in each video image frame into a preset target face image database for comparison comprises:
for any face image in any video image frame, marking feature coordinates of each key feature in the face image and each target face image in the target face image database based on a preset face key feature list;
calculating a feature distance value between the feature coordinate of each key feature of the face image and the feature coordinate of each key feature in the target face image for any target face image in the target face image database; calculating to obtain a target average value of the characteristic distance values; obtaining the matching degree corresponding to the target average value according to the corresponding relation between the preset average value and the matching degree; the preset corresponding relation between the average value and the matching degree comprises at least two average value intervals and the matching degree corresponding to each average value interval, and the average value intervals and the matching degree are in an anti-correlation relation;
obtaining the matching degree of the face image corresponding to each target face image in the target face image database;
and if at least one matching degree which is greater than or equal to a preset matching degree threshold value exists, the arbitrary video image frame is the target video image frame.
3. The video screenshot method based on voice recognition and image processing as claimed in claim 1, wherein said inputting the screenshot instruction text data into a preset video screenshot instruction special dictionary for comparison comprises:
and comparing each word in the video screenshot instruction special dictionary with the screenshot instruction text data to obtain whether the word in the video screenshot instruction special dictionary exists in the screenshot instruction text data or not.
4. The method of claim 1, wherein the words in the video screenshot instruction special dictionary comprise "screenshot".
5. The method of claim 4, wherein the words in the video screenshot instruction special dictionary further comprise words related to "screenshot".
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010330355.2A CN111464862A (en) | 2020-04-24 | 2020-04-24 | Video screenshot method based on voice recognition and image processing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111464862A | 2020-07-28
Family
ID=71682606
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010330355.2A Withdrawn CN111464862A (en) | 2020-04-24 | 2020-04-24 | Video screenshot method based on voice recognition and image processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111464862A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104540004A (en) * | 2015-01-27 | 2015-04-22 | 深圳市中兴移动通信有限公司 | Video screenshot method and video screenshot device |
CN106610772A (en) * | 2015-10-21 | 2017-05-03 | 中兴通讯股份有限公司 | Screen capture method and apparatus, and intelligent terminal |
CN106650577A (en) * | 2016-09-22 | 2017-05-10 | 江苏理工学院 | Fast retrieval method and fast retrieval system for target person in monitoring video data file |
US20170237896A1 (en) * | 2015-05-08 | 2017-08-17 | Albert Tsai | System and Method for Preserving Video Clips from a Handheld Device |
CN108985176A (en) * | 2018-06-20 | 2018-12-11 | 北京优酷科技有限公司 | image generating method and device |
CN109584864A (en) * | 2017-09-29 | 2019-04-05 | 上海寒武纪信息科技有限公司 | Image processing apparatus and method |
CN109598223A (en) * | 2018-11-26 | 2019-04-09 | 北京洛必达科技有限公司 | Method and apparatus based on video acquisition target person |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112203036A (en) * | 2020-09-14 | 2021-01-08 | 北京神州泰岳智能数据技术有限公司 | Method and device for generating text document based on video content |
CN112203036B (en) * | 2020-09-14 | 2023-05-26 | 北京神州泰岳智能数据技术有限公司 | Method and device for generating text document based on video content |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20200728 |