CN111144360A - Multimode information identification method and device, storage medium and electronic equipment - Google Patents
- Publication number
- CN111144360A
- Authority
- CN
- China
- Prior art keywords
- information
- target
- early warning
- warning model
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/43—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of news video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Abstract
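The invention provides a multimode information identification method and device, a storage medium, and electronic equipment. The method first acquires characteristic parameters of a video to be identified, then inputs the characteristic parameters into a trained target information early warning model, which outputs a target video expression tendency. The invention thus provides a technical scheme that automatically screens multimode information based on an early warning model, improving screening efficiency.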
Description
Technical Field
The invention relates to the technical field of information identification, in particular to a multimode information identification method, a multimode information identification device, a storage medium and electronic equipment.
Background
Currently, some specific fields, such as public security, need to identify the content of videos. This content identification is currently performed by manual screening, which is inefficient.
Therefore, how to provide a multimode information identification method that screens multimode information automatically and improves screening efficiency is a major technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the embodiment of the present invention provides a method for identifying multimodal information, which can automatically perform screening of multimodal information, and improve screening efficiency.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
a multimodal information identification method comprising:
acquiring characteristic parameters of a video to be recognized, wherein the characteristic parameters at least comprise one or more of face characteristic information, character behavior characteristic information, activity scene characteristic information, subtitle content characteristic information, video background characteristic information and audio characteristic information;
and inputting the characteristic parameters into a trained target information early warning model, and outputting a target video expression tendency by the target early warning model.
Optionally, the method further includes:
acquiring a user confirmation result triggered by the user based on the target video expression tendency;
generating a positive and negative sample set based on the user confirmation result;
and training the target information early warning model based on the positive and negative sample sets.
Optionally, the inputting the characteristic parameters into a trained target information early warning model, and outputting a target video expression tendency by the target early warning model includes:
acquiring the weight of each characteristic parameter in the target information early warning model;
and based on the weight, carrying out weighting processing on the characteristic parameters input into the target information early warning model, and outputting a target video expression tendency.
Optionally, the training the target information early warning model based on the positive and negative sample sets includes:
acquiring index information of each video in the positive and negative sample sets;
and training the target information early warning model based on the index information.
A multimode information identification device comprising:
the first acquisition module is used for acquiring characteristic parameters of a video to be identified, wherein the characteristic parameters at least comprise one or more of face characteristic information, character behavior characteristic information, activity scene characteristic information, subtitle content characteristic information, video background characteristic information and audio characteristic information;
and the processing module is used for inputting the characteristic parameters into the trained target information early warning model and outputting the target video expression tendency by the target early warning model.
Optionally, the device further includes:
the second acquisition module is used for acquiring a user confirmation result triggered by the user based on the target video expression tendency;
the generating module is used for generating a positive and negative sample set based on the user confirmation result;
and the training module is used for training the target information early warning model based on the positive and negative sample sets.
Optionally, the processing module includes:
the first acquisition unit is used for acquiring the weight of each characteristic parameter in the target information early warning model;
and the processing unit is used for performing weighting processing on the characteristic parameters input into the target information early warning model based on the weight and outputting a target video expression tendency.
Optionally, the training module includes:
the second acquisition unit is used for acquiring index information of each video in the positive and negative sample sets;
and the training unit is used for training the target information early warning model based on the index information.
A storage medium comprising a stored program, wherein a device on which the storage medium is located is controlled to execute any one of the above multimode information identification methods when the program runs.
An electronic device comprising at least one processor, and at least one memory and a bus connected to the processor; the processor and the memory communicate with each other through the bus; the processor is configured to call program instructions in the memory to perform any one of the above multimode information identification methods.
Based on the above technical scheme, the invention provides a multimode information identification method and device, a storage medium, and electronic equipment. The method first acquires characteristic parameters of a video to be identified, then inputs the characteristic parameters into a trained target information early warning model, which outputs a target video expression tendency. The invention thus provides a technical scheme that automatically screens multimode information based on an early warning model, improving screening efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic flowchart of a multimode information identification method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a multimode information identification method according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a multimode information identification method according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a multimode information identification method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a multimode information identification device according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a multimode information identification system according to an embodiment of the present invention;
Fig. 7 is a hardware schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a multimode information identification method according to an embodiment of the present invention. The method automatically screens multimode information based on an early warning model, which improves screening efficiency, and specifically includes the following steps:
S11, acquiring the characteristic parameters of the video to be identified.
In this embodiment, the characteristic parameters may include face characteristic information, character behavior characteristic information, activity scene characteristic information, subtitle content characteristic information, video background characteristic information, and audio characteristic information.
Specifically, the characteristic parameters of the video, covering pictures, subtitles, audio, titles, and the like, can be extracted by constructing a multivariate data identification channel. For example, vehicle features, face features, and sensitive scene features in the video are analyzed and identified by a specific-object identification model based on a feature pyramid and a video scene identification model based on a deep neural network.
For another example, a semi-supervised algorithm and a data popularization method can be used to convert Chinese and English speech in the video into text, and a neural network algorithm based on an input sequence pointer, together with a discriminative training method, can be used to realize speech recognition, language identification, and sensitive word retrieval for the speech data of the video under complex channel conditions.
For another example, based on a deep learning OCR recognition framework, a multi-mode comprehensive algorithm is adopted to convert Uyghur-script images into text with high accuracy, taking into account characteristics such as Uyghur grammar and writing. Based on a neural machine translation model, a neural network algorithm with an input sequence pointer and an attention mechanism are used to understand the semantics of long Uyghur passages and translate them into Chinese.
Of course, the embodiment of the present invention may also adopt other identification manners of the characteristic parameters, and is not limited to the foregoing exemplary manner.
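As an illustration of step S11, the six feature types can be thought of as optional fields of a single record, at least one of which must be present. The sketch below is only one possible representation: the class name `FeatureParameters`, its field names, and the use of plain float sequences are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Sequence


@dataclass
class FeatureParameters:
    """Hypothetical container for the six feature types listed in S11."""
    face: Optional[Sequence[float]] = None
    character_behavior: Optional[Sequence[float]] = None
    activity_scene: Optional[Sequence[float]] = None
    subtitle_content: Optional[Sequence[float]] = None
    video_background: Optional[Sequence[float]] = None
    audio: Optional[Sequence[float]] = None

    def present(self) -> List[str]:
        """Return the names of the feature types that were extracted."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def validate(self) -> None:
        """The method requires at least one of the six feature types."""
        if not self.present():
            raise ValueError("at least one feature type is required")


# Example: a video for which only face and audio features were extracted.
params = FeatureParameters(face=[0.1, 0.9], audio=[0.3])
params.validate()
```

Keeping every field optional mirrors the wording "at least one or more of" used for the characteristic parameters.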
S12, inputting the characteristic parameters into the trained target information early warning model, which outputs the target video expression tendency.
It should be noted that, in the embodiment of the present invention, the target information early warning model needs to be trained in advance. The acquired characteristic parameters are then input into the trained model, and the model outputs a prediction result. The target video expression tendency is a label that characterizes the video content, such as a mark identification, a yellow identification, and the like.
For example, the model may be established as follows: the data from the identification channels is screened, identified, and classified to form effective metadata; an expert system is then built from front-line practical experience; and the multimode data is fused and analyzed to establish a social security sensitive information early warning model suitable for mobile internet social platforms.
Therefore, the scheme can comprehensively judge the video expression content through multi-angle recognition of characters, behaviors, scenes, subtitles, backgrounds, voices and the like in the short video based on an artificial intelligence technology, achieves automatic screening of the video, and improves screening efficiency compared with manual screening.
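The two-step flow of S11 and S12 can be sketched as a function that takes a feature extractor and a trained model as parameters. Everything named here (`identify`, `stub_extractor`, `stub_model`, the labels "flagged" and "normal", and the 0.5 cut-off) is a hypothetical placeholder, not the patent's implementation.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def identify(video_path: str,
             extract_features: Callable[[str], Features],
             warning_model: Callable[[Features], str]) -> str:
    """Run S11 (feature extraction) followed by S12 (model inference)."""
    features = extract_features(video_path)  # S11
    return warning_model(features)           # S12


def stub_extractor(video_path: str) -> Features:
    # Stand-in for the identification channels; returns fixed scores.
    return {"face": 0.8, "audio": 0.1}


def stub_model(features: Features) -> str:
    # Stand-in for the trained early warning model (assumed 0.5 cut-off).
    return "flagged" if max(features.values()) >= 0.5 else "normal"


tendency = identify("clip.mp4", stub_extractor, stub_model)
```

Passing the extractor and the model as callables keeps the flow independent of any particular recognition technique, consistent with the statement that other identification manners may be used.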
On the basis of the above embodiment, as shown in Fig. 2, an embodiment of the present invention further provides a specific implementation of inputting the characteristic parameters into the trained target information early warning model and outputting the target video expression tendency, including the steps of:
S21, acquiring the weight of each characteristic parameter in the target information early warning model;
S22, based on the weights, weighting the characteristic parameters input into the target information early warning model, and outputting the target video expression tendency.
In the embodiment of the present invention, there are many identification channels, that is, many factors that determine the judgment result, so a weighted comprehensive judgment method is needed as the standard for determining the video expression tendency. For example, for a content item, the person-and-vehicle factor (face, vehicle) has a weight of 60%, the keyword factor (speech, subtitle, title, comment) has a weight of 20%, and the scene factor (multi-action scene, logo, building) has a weight of 20%.
After the weights are determined, the characteristic parameters input into the target information early warning model are weighted, and the target video expression tendency is output.
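The weighting in S21 and S22 can be illustrated with the 60% / 20% / 20% split given in the example. The sketch assumes that each factor has already been reduced to a score between 0 and 1 and that the final label is obtained by comparing the weighted sum against a threshold; the factor keys, the threshold of 0.5, and the labels are assumptions, since the patent does not specify how the weighted result is mapped to a tendency.

```python
from typing import Dict

# Weights taken from the example in the description.
WEIGHTS: Dict[str, float] = {
    "person_vehicle": 0.6,  # face, vehicle
    "keyword": 0.2,         # speech, subtitle, title, comment
    "scene": 0.2,           # multi-action scene, logo, building
}


def weighted_score(scores: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine per-factor scores (assumed to lie in [0, 1]) into one score."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # A factor that was not detected contributes a score of 0.
    return sum(w * scores.get(name, 0.0) for name, w in weights.items())


def expression_tendency(scores: Dict[str, float], threshold: float = 0.5) -> str:
    """Map the weighted score to a label; threshold and labels are hypothetical."""
    return "flagged" if weighted_score(scores) >= threshold else "normal"


# 0.6 * 0.9 + 0.2 * 0.5 + 0.2 * 0.0 is approximately 0.64, above the threshold.
label = expression_tendency({"person_vehicle": 0.9, "keyword": 0.5, "scene": 0.0})
```

With these weights, a strong keyword or scene signal alone cannot exceed 0.2, so a video is flagged only when the person-and-vehicle factor contributes substantially.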
Specifically, in order to further improve the output accuracy of the target information early warning model, as shown in Fig. 3, the embodiment of the present invention may further include, on the basis of the above multimode information identification method, the steps of:
S31, acquiring a user confirmation result triggered by the user based on the target video expression tendency;
S32, generating a positive and negative sample set based on the user confirmation result;
S33, training the target information early warning model based on the positive and negative sample set.
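Steps S31 and S32 can be sketched as sorting videos into two lists according to the user's confirmation. The sketch assumes that a video whose predicted tendency the user confirms becomes a positive sample and one the user rejects becomes a negative sample; the patent does not define the split, and `SampleSet` and `build_sample_set` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class SampleSet:
    """Hypothetical positive and negative sample set (S32)."""
    positive: List[str] = field(default_factory=list)
    negative: List[str] = field(default_factory=list)


def build_sample_set(confirmations: Iterable[Tuple[str, bool]]) -> SampleSet:
    """Sort videos by the user's confirmation of the predicted tendency.

    Each item is (video_id, confirmed), where confirmed is True when the
    user agreed with the model's output (assumed meaning).
    """
    samples = SampleSet()
    for video_id, confirmed in confirmations:
        target = samples.positive if confirmed else samples.negative
        target.append(video_id)
    return samples


samples = build_sample_set([("v1", True), ("v2", False), ("v3", True)])
```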
The training of the target information early warning model based on the positive and negative sample set may be implemented as shown in Fig. 4, including:
S41, acquiring index information of each video in the positive and negative sample set;
S42, training the target information early warning model based on the index information.
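The patent does not say what the index information contains or which training algorithm is used. Purely as an illustration, the sketch below assumes that the index information of each video is a vector of per-channel scores and fits a small logistic regression to the positive and negative samples by stochastic gradient descent. The technique is substituted here for demonstration and is not the patent's method; all names and values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (index vector, 1 = positive, 0 = negative)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: Sequence[Sample], epochs: int = 500,
          lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic-regression weights and a bias by stochastic gradient descent."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> bool:
    """Return True when the model assigns the positive class."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5


# Made-up index vectors: (person/vehicle score, keyword score, scene score).
data: List[Sample] = [
    ((0.9, 0.8, 0.7), 1), ((0.8, 0.6, 0.9), 1),
    ((0.1, 0.2, 0.1), 0), ((0.2, 0.1, 0.3), 0),
]
w, b = train(data)
```

In this reading, the learned weights play the role of the per-parameter weights obtained in S21, so each round of user confirmation can adjust how much each factor contributes.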
That is, the system identifies the video through the video content identification channel, for example by extracting and identifying the faces in a short video, identifying character behavior characteristics, identifying the characteristics of the activity scene, identifying the subtitle content, identifying the background characteristics of the video, and identifying the speech content in the video.
The identification results are then passed to the information early warning model, which comprehensively judges the video expression tendency by weighting each item. The identified target video is displayed to the user, who confirms the identification result. At the same time, the system stores the confirmation result in the positive and negative sample library, and the identification channel algorithm is optimized with manual participation to improve identification accuracy.
On the basis of the above embodiment, as shown in fig. 5, an embodiment of the present invention further provides a multimode information identification apparatus, including:
the first obtaining module 51 is configured to obtain feature parameters of a video to be identified, where the feature parameters at least include one or more of face feature information, character behavior feature information, activity scene feature information, subtitle content feature information, video background feature information, and audio feature information;
and the processing module 52 is configured to input the characteristic parameters into the trained target information early warning model, and output a target video expression tendency by the target early warning model.
In addition, the multimode information identification device may further include:
the second acquisition module is used for acquiring a user confirmation result triggered by the user based on the target video expression tendency;
the generating module is used for generating a positive and negative sample set based on the user confirmation result;
and the training module is used for training the target information early warning model based on the positive and negative sample sets.
Wherein the processing module may include:
the first acquisition unit is used for acquiring the weight of each characteristic parameter in the target information early warning model;
and the processing unit is used for performing weighting processing on the characteristic parameters input into the target information early warning model based on the weight and outputting a target video expression tendency.
In addition, the training module may include:
the second acquisition unit is used for acquiring index information of each video in the positive and negative sample sets;
and the training unit is used for training the target information early warning model based on the index information.
For the working principle of the device, refer to the above method embodiments; it is not repeated here.
On the basis of the above embodiments, the embodiment of the present invention further provides a multimode information identification system, the structure of which is shown in fig. 6, and the multimode information identification system includes a database 61, a video content identification channel 62, an information early warning model 63, a positive and negative sample library 64, and an identification channel optimization module 65.
The multimode information identification device comprises a processor and a memory. The first acquisition module, the processing module, and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and multimode information is screened automatically by adjusting kernel parameters, which improves screening efficiency.
An embodiment of the present invention provides a storage medium having a program stored thereon, which when executed by a processor implements the multimode information identification method.
The embodiment of the invention provides a processor, which is used for running a program, wherein the multimode information identification method is executed when the program runs.
An embodiment of the present invention provides an electronic device, as shown in fig. 7, the device includes at least one processor 71, and at least one memory 72 and a bus 73 connected to the processor; the processor and the memory complete mutual communication through a bus; the processor is used for calling the program instructions in the memory to execute the multimode information identification method. The device herein may be a server, a PC, a PAD, a mobile phone, etc.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
acquiring characteristic parameters of a video to be recognized, wherein the characteristic parameters at least comprise one or more of face characteristic information, character behavior characteristic information, activity scene characteristic information, subtitle content characteristic information, video background characteristic information and audio characteristic information;
and inputting the characteristic parameters into a trained target information early warning model, and outputting a target video expression tendency by the target early warning model.
Optionally, the method further includes:
acquiring a user confirmation result triggered by the user based on the target video expression tendency;
generating a positive and negative sample set based on the user confirmation result;
and training the target information early warning model based on the positive and negative sample sets.
Optionally, the inputting the characteristic parameters into a trained target information early warning model, and outputting a target video expression tendency by the target early warning model includes:
acquiring the weight of each characteristic parameter in the target information early warning model;
and based on the weight, carrying out weighting processing on the characteristic parameters input into the target information early warning model, and outputting a target video expression tendency.
Optionally, the training the target information early warning model based on the positive and negative sample sets includes:
acquiring index information of each video in the positive and negative sample sets;
and training the target information early warning model based on the index information.
In summary, the present invention provides a method, an apparatus, a storage medium and an electronic device for identifying multi-mode information, wherein the identifying method first obtains characteristic parameters of a video to be identified, then inputs the characteristic parameters into a trained target information early warning model, and outputs a target video expression tendency through the target early warning model. Therefore, the technical scheme of automatically screening the multimode information based on the early warning model is provided, and the screening efficiency is improved.
The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant details can be found in the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A multimode information identification method is characterized by comprising the following steps:
acquiring characteristic parameters of a video to be recognized, wherein the characteristic parameters at least comprise one or more of face characteristic information, character behavior characteristic information, activity scene characteristic information, subtitle content characteristic information, video background characteristic information and audio characteristic information;
and inputting the characteristic parameters into a trained target information early warning model, and outputting a target video expression tendency by the target early warning model.
2. The multimodal information identification method as claimed in claim 1, further comprising:
acquiring a user confirmation result triggered by the user based on the target video expression tendency;
generating a positive and negative sample set based on the user confirmation result;
and training the target information early warning model based on the positive and negative sample sets.
3. The multimodal information identification method according to claim 1, wherein the inputting the feature parameters into a trained target information early warning model, and outputting target video expression tendency by the target early warning model comprises:
acquiring the weight of each characteristic parameter in the target information early warning model;
and based on the weight, carrying out weighting processing on the characteristic parameters input into the target information early warning model, and outputting a target video expression tendency.
4. The multimodal information identification method according to claim 2, wherein the training the target information early warning model based on the positive and negative sample sets comprises:
acquiring index information of each video in the positive and negative sample sets;
and training the target information early warning model based on the index information.
5. A multimode information identification device, comprising:
the first acquisition module is used for acquiring characteristic parameters of a video to be identified, wherein the characteristic parameters at least comprise one or more of face characteristic information, character behavior characteristic information, activity scene characteristic information, subtitle content characteristic information, video background characteristic information and audio characteristic information;
and the processing module is used for inputting the characteristic parameters into the trained target information early warning model and outputting the target video expression tendency by the target early warning model.
6. The multimode information identification device of claim 5, further comprising:
the second acquisition module is used for acquiring a user confirmation result triggered by the user based on the target video expression tendency;
the generating module is used for generating a positive and negative sample set based on the user confirmation result;
and the training module is used for training the target information early warning model based on the positive and negative sample sets.
7. The multimode information identification device of claim 5, wherein the processing module comprises:
the first acquisition unit is used for acquiring the weight of each characteristic parameter in the target information early warning model;
and the processing unit is used for performing weighting processing on the characteristic parameters input into the target information early warning model based on the weight and outputting a target video expression tendency.
8. The multimode information identification device of claim 6, wherein the training module comprises:
the second acquisition unit is used for acquiring index information of each video in the positive and negative sample sets;
and the training unit is used for training the target information early warning model based on the index information.
9. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device where the storage medium is located is controlled to execute the multimode information identification method according to any one of claims 1 to 4.
10. An electronic device comprising at least one processor, and at least one memory and a bus connected to the processor; the processor and the memory communicate with each other through the bus; the processor is configured to call program instructions in the memory to perform the multimode information identification method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911410099.1A CN111144360A (en) | 2019-12-31 | 2019-12-31 | Multimode information identification method and device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111144360A (en) | 2020-05-12 |
Family
ID=70522490
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911410099.1A Pending CN111144360A (en) | 2019-12-31 | 2019-12-31 | Multimode information identification method and device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111144360A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112633081A (en) * | 2020-12-07 | 2021-04-09 | 深圳市来科计算机科技有限公司 | Specific object identification method in complex scene |
CN112765394A (en) * | 2021-01-07 | 2021-05-07 | 上海喜日电子科技有限公司 | Data processing method and device, electronic equipment and storage medium |
CN112765402A (en) * | 2020-12-31 | 2021-05-07 | 北京奇艺世纪科技有限公司 | Sensitive information identification method, device, equipment and storage medium |
WO2022184254A1 (en) * | 2021-03-03 | 2022-09-09 | Robidia GmbH | Method for controlling a camera robot |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110249886A1 (en) * | 2010-04-12 | 2011-10-13 | Samsung Electronics Co., Ltd. | Image converting device and three-dimensional image display device including the same |
CN107801090A (en) * | 2017-11-03 | 2018-03-13 | 北京奇虎科技有限公司 | Method, apparatus and computing device for detecting anomalous video files using audio information |
CN108040262A (en) * | 2018-01-25 | 2018-05-15 | 湖南机友科技有限公司 | Real-time pornography detection method and device for live audio and video |
CN108510194A (en) * | 2018-03-30 | 2018-09-07 | 平安科技(深圳)有限公司 | Risk control model training method, risk identification method, device, equipment and medium |
CN109711297A (en) * | 2018-12-14 | 2019-05-03 | 深圳壹账通智能科技有限公司 | Risk identification method, device, computer equipment and storage medium based on facial picture |
CN110263220A (en) * | 2019-06-28 | 2019-09-20 | 北京奇艺世纪科技有限公司 | Video highlight segment recognition method and device |
Non-Patent Citations (2)
Title |
---|
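Wen Mengfei et al.: "A heterogeneous multimodal target recognition method based on deep learning", Journal of Central South University (Science and Technology), vol. 47, no. 05, 31 May 2016 (2016-05-31), pages 1580 - 1587 * |
Yang Lei et al.: "Introduction to Digital Media Technology", vol. 1, Beijing: China Railway Publishing House, pages: 41 * |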
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108536681B (en) | Intelligent question-answering method, device, equipment and storage medium based on emotion analysis | |
CN111144360A (en) | Multimode information identification method and device, storage medium and electronic equipment | |
CN110517689B (en) | Voice data processing method, device and storage medium | |
CN110364146B (en) | Speech recognition method, speech recognition device, speech recognition apparatus, and storage medium | |
US20200135158A1 (en) | System and Method of Reading Environment Sound Enhancement Based on Image Processing and Semantic Analysis | |
US20230103340A1 (en) | Information generating method and apparatus, device, storage medium, and program product | |
CN110472008B (en) | Intelligent interaction method and device | |
CN102855317A (en) | Multimode indexing method and system based on demonstration video | |
CN115713715A (en) | Human behavior recognition method and system based on deep learning | |
CN112150457A (en) | Video detection method, device and computer readable storage medium | |
CN112328830A (en) | Information positioning method based on deep learning and related equipment | |
CN113128284A (en) | Multi-mode emotion recognition method and device | |
CN110781329A (en) | Image searching method and device, terminal equipment and storage medium | |
KR102395410B1 (en) | System and method for providing sign language avatar using non-marker | |
CN112261321B (en) | Subtitle processing method and device and electronic equipment | |
CN114267324A (en) | Voice generation method, device, equipment and storage medium | |
CN113672086A (en) | Page processing method, device, equipment and medium | |
Hukkeri et al. | Erratic navigation in lecture videos using hybrid text based index point generation | |
CN111666469B (en) | Statement library construction method, device, equipment and storage medium | |
CN113378826B (en) | Data processing method, device, equipment and storage medium | |
CN116127366B (en) | Emotion recognition method, system and medium based on TWS earphone | |
CN113488025B (en) | Text generation method, device, electronic equipment and readable storage medium | |
CN116913278B (en) | Voice processing method, device, equipment and storage medium | |
CN116978007A (en) | Image content identification method and device, electronic equipment, storage medium and product | |
Park et al. | Question type classification for comic QA system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200512 ||