CN110442867A - Image processing method, device, terminal and computer storage medium - Google Patents
- Publication number
- CN110442867A CN110442867A CN201910693744.9A CN201910693744A CN110442867A CN 110442867 A CN110442867 A CN 110442867A CN 201910693744 A CN201910693744 A CN 201910693744A CN 110442867 A CN110442867 A CN 110442867A
- Authority
- CN
- China
- Prior art keywords
- mood
- data
- target
- image
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
Abstract
Embodiments of the invention provide an image processing method, apparatus, terminal, and computer storage medium. The method includes: obtaining mood data and an image to be processed; identifying the target emotion reflected by the mood data; matching the image to be processed with a corresponding target filter mode according to the target emotion; and applying the target filter mode to the image to be processed to obtain a target image. Embodiments of the invention address problems of conventional techniques such as poor image enhancement and the inability to accurately express the user's true intent.
Description
Technical field
The present invention relates to the field of image processing technology, and in particular to an image processing method, apparatus, terminal, and computer storage medium.
Background art
Social interaction refers to contact between people in society: the communication of information and the exchange of ideas by certain means (or tools) in order to achieve some purpose. With the development of science and technology and the use of Internet resources in daily life, interpersonal contact is increasingly conducted over the Internet, and strangers can also socialize online to broaden and develop themselves.

At present, strangers socialize largely by means of intelligent terminals. In stranger-oriented social applications, users present themselves through posts such as text, voice, and images in order to attract and interact with like-minded people. Among these, images are the most common choice for publishing personal posts. In practice, however, the filter modes provided by intelligent terminals are relatively limited, so image enhancement is poor, the effect of published images is constrained, and the user's true intent cannot be accurately expressed. This dampens users' enthusiasm for interacting with strangers, reduces the usage of stranger-oriented social applications, and hinders the development of social interaction between strangers.
Summary of the invention
Embodiments of the invention provide an image processing method, apparatus, terminal, and computer storage medium that can improve image effects, thereby promoting user interaction and increasing the usage of social applications.
In one aspect, an embodiment of the invention provides an image processing method, the method comprising:

obtaining mood data and an image to be processed, the mood data comprising mood voice data, mood image data, or mood text data;

identifying the target emotion reflected by the mood data;

matching the image to be processed with a corresponding target filter mode according to the target emotion; and

applying the target filter mode to the image to be processed to obtain a target image.
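The claimed steps can be sketched as a minimal pipeline. The emotion labels, filter names, and the keyword-lookup `identify_emotion` stub below are illustrative assumptions standing in for the trained models and filter implementations described in the patent, not part of the claim itself:

```python
# Minimal sketch of the claimed pipeline: mood data -> target emotion
# -> target filter mode -> filtered (target) image. All names are illustrative.

# Hypothetical mapping from recognized emotion to a filter mode.
EMOTION_TO_FILTER = {
    "happy": "warm",
    "sad": "cool",
    "angry": "high_contrast",
}

def identify_emotion(mood_text: str) -> str:
    """Stub emotion recognizer: keyword lookup standing in for a trained model."""
    lexicon = {"great": "happy", "awful": "sad", "furious": "angry"}
    for word, emotion in lexicon.items():
        if word in mood_text.lower():
            return emotion
    return "happy"  # fallback label

def apply_filter(pixels: list, mode: str) -> list:
    """Toy 'filter processing': shift grayscale pixel values per mode."""
    shift = {"warm": 10, "cool": -10, "high_contrast": 0}[mode]
    return [max(0, min(255, p + shift)) for p in pixels]

def process_image(mood_text: str, pixels: list) -> tuple:
    emotion = identify_emotion(mood_text)     # S102: identify target emotion
    mode = EMOTION_TO_FILTER[emotion]         # S103: match target filter mode
    return mode, apply_filter(pixels, mode)   # S104: produce target image

mode, target = process_image("today is great!", [0, 100, 250])
```

In a real implementation the recognizer would be one of the neural models discussed later in the description, and the filter would operate on full RGB images rather than a toy grayscale list.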
In another aspect, an embodiment of the invention provides an image processing apparatus, the apparatus comprising:

an acquiring unit configured to obtain mood data and an image to be processed, the mood data comprising mood text data, mood voice data, or mood image data;

a recognition unit configured to identify the target emotion reflected by the mood data;

a matching unit configured to match the image to be processed with a corresponding target filter mode according to the target emotion; and

a processing unit configured to apply the target filter mode to the image to be processed to obtain a target image.
In yet another aspect, an embodiment of the invention provides a terminal comprising an input device and an output device, the terminal further comprising:

a processor adapted to implement one or more instructions; and

a computer storage medium storing one or more instructions suitable for being loaded by the processor to perform the following steps:

obtaining mood data and an image to be processed, the mood data comprising mood text data, mood voice data, or mood image data;

identifying the target emotion reflected by the mood data;

matching the image to be processed with a corresponding target filter mode according to the target emotion; and

applying the target filter mode to the image to be processed to obtain a target image.

In a further aspect, an embodiment of the invention provides a computer storage medium storing one or more instructions suitable for being loaded by a processor to perform the same steps: obtaining mood data and an image to be processed; identifying the target emotion reflected by the mood data; matching the image to be processed with a corresponding target filter mode according to the target emotion; and applying the target filter mode to the image to be processed to obtain a target image.
Embodiments of the invention obtain mood data and an image to be processed, identify the target emotion reflected by the mood data, match the image to be processed with a corresponding target filter mode according to that emotion, and finally apply the target filter mode to the image to obtain a target image. Applying filters based on mood in this way addresses problems of conventional techniques such as poor image enhancement, the inability to accurately express the user's true intent, and reduced enthusiasm for interaction.
Brief description of the drawings
To explain the technical solutions in the embodiments of the invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present invention.

Fig. 2(a) and Fig. 2(b) are waveform diagrams of two kinds of mood voice data provided by an embodiment of the present invention.

Fig. 3 is a schematic diagram of mood partitioning provided by an embodiment of the present invention.

Fig. 4 and Fig. 5 are schematic flowcharts of two further image processing methods provided by embodiments of the present invention.

Fig. 6(a)-Fig. 6(h) are a series of schematic scenario diagrams provided by an embodiment of the present invention.

Fig. 7 is a schematic flowchart of another image processing method provided by an embodiment of the present invention.

Fig. 8 is a schematic flowchart of yet another image processing method provided by an embodiment of the present invention.

Fig. 9 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present invention.

Fig. 10 is a schematic structural diagram of a terminal provided by an embodiment of the present invention.
Specific embodiment
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.

The terms "first", "second", and "third" (if present) in the specification, claims, and drawings are used to distinguish different objects, not to describe a particular order. Moreover, the terms "comprise", "include", and any variants thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device containing a series of steps or units is not limited to the listed steps or units, but may optionally further include steps or units that are not listed, or that are inherent to such a process, method, product, or device.
Referring to Fig. 1, which is a schematic flowchart of an image processing method according to an embodiment of the invention, the method may be executed by a terminal and includes the following steps S101-S104.
S101: Obtain mood data and an image to be processed.
On detecting a post-publishing instruction in a social application, the terminal may respond to that instruction by obtaining mood data and an image to be processed. The publishing instruction may be received from another device (such as a server), or generated by the terminal upon detecting a publishing operation by the user, i.e., the operation a user performs in the social application in order to publish a post, for example a swipe along a predetermined trajectory or a series of taps on designated buttons in the social application.
Mood data refers to data describing the user's mood. A mood here is a collective term for a series of subjective experiences: a comprehensive psychological and physiological state produced by multiple feelings, thoughts, and behaviors. For example, moods may include, but are not limited to, anger, happiness, excitement, hope, or other terms describing the user's psychological and physiological state.

In practical applications, the specific form of the mood data is not limited and may include, but is not limited to, at least one of the following: mood voice data, mood image data, mood video data, and mood text data. Since video usually consists of a sequence of image frames, mood video data can be regarded as being composed of frames of mood image data, and the terminal's analysis of mood video data is essentially an analysis of those frames. The following description therefore uses mood image data in place of mood video data.
S102: Identify the target emotion reflected by the mood data.
In one embodiment, if the mood data includes mood voice data, step S102 may specifically include the following steps S11-S13.
S11: Convert the mood voice data into mood text data, and extract the text features in the mood text data.

The terminal converts the mood voice data into corresponding mood text data through a speech recognition program. The speech recognition program may be a system program deployed on the terminal, or a third-party application, that performs speech-to-text conversion. The terminal then uses a text feature extraction algorithm to extract the text features contained in the mood text data; these features reflect the mood expressed by the text of the mood voice data. The text feature extraction algorithm is custom-configured by the system, for example according to actual needs, and may include, but is not limited to, text feature vector algorithms, principal component analysis, or other algorithms for extracting text features.
S12: Extract the acoustic features from the mood voice data.

The terminal extracts the acoustic features in the mood voice data using an acoustic feature extraction algorithm, which may be custom-configured by the system, for example a convolutional neural network algorithm or a recurrent neural network algorithm.

Optionally, the acoustic features include time-domain acoustic features and/or frequency-domain acoustic features. Time-domain acoustic features are features of the mood voice data in the time domain that reflect the user's mood; frequency-domain acoustic features are features of the mood voice data in the frequency domain that reflect the user's mood.
In practical applications, the mood voice data collected by the terminal is essentially a speech signal containing features of the speech in both the frequency and time domains. The waveform of this signal (also called a time-frequency signal) is shown in Fig. 2(a), in which the abscissa represents time and the ordinate represents the oscillation amplitude (amplitude for short). The terminal can extract features of the speech signal in the time domain, such as time, amplitude, and frequency, to obtain the time-domain acoustic features of the speech. The terminal may further use a Fourier transform algorithm to convert the speech signal into a speech spectrum; Fig. 2(b) shows a schematic speech spectrogram. The spectrogram is the representation of the speech signal in the frequency domain, i.e., the waveform obtained after converting the time-domain signal of the speech into a frequency-domain signal, with the abscissa representing time and the ordinate representing frequency. By analyzing how the frequency-domain signal changes over different periods in the spectrogram, the terminal obtains the frequency-domain acoustic features exhibited by the speech signal.

Specifically, the terminal may analyze the spectrogram with a frequency-domain feature extraction algorithm to obtain the frequency-domain acoustic features. This algorithm may be custom-configured by the system for extracting the frequency-domain features of speech, and may include, but is not limited to, convolutional neural network algorithms, recurrent neural network algorithms, and so on. For example, the terminal may use a convolutional neural network algorithm to perform local feature extraction on the spectrogram, applying image processing that is invariant to shifting, scaling, or other forms of distortion, to obtain the frequency-domain acoustic features.
S13: Call the first mood model to perform fused recognition on the text features and acoustic features, obtaining the target emotion.

The terminal calls the first mood model to jointly recognize the text features and acoustic features, obtaining the target emotion reflected by the mood voice data. The first mood model may be custom-configured by the system, for example according to user preference or actual demand. It is a model trained in advance for recognizing user emotion, and may include, but is not limited to, a feed-forward neural network (FF), a deep feed-forward neural network (DFF), a recurrent neural network (RNN), a long short-term memory network (LSTM), or another model for emotion recognition.

It should be noted that if the accuracy of emotion recognition is not a concern, the terminal may consider only the text features or only the acoustic features, and take the correspondingly recognized emotion as the target emotion reflected by the mood data. Not jointly considering the text and acoustic features saves computing resources on the terminal and improves processing efficiency.
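In its simplest form, the fused recognition of S13 could be a weighted late fusion of per-modality emotion scores. The label set, scores, and weights below are invented for illustration and stand in for the outputs of the trained text and acoustic models:

```python
# Late-fusion sketch: combine per-emotion scores from a text model and an
# acoustic model with fixed weights, then pick the argmax.

def fuse_scores(text_scores, acoustic_scores, w_text=0.6, w_acoustic=0.4):
    """Weighted late fusion over a shared emotion label set."""
    fused = {}
    for label in text_scores:
        fused[label] = (w_text * text_scores[label]
                        + w_acoustic * acoustic_scores[label])
    return max(fused, key=fused.get)

# Hypothetical per-modality scores (e.g. softmax outputs of two models).
text_scores = {"happy": 0.7, "sad": 0.2, "angry": 0.1}
acoustic_scores = {"happy": 0.4, "sad": 0.5, "angry": 0.1}
target_emotion = fuse_scores(text_scores, acoustic_scores)
```

Setting one weight to zero recovers the single-modality shortcut the paragraph above mentions: with `w_text=0` the result depends on the acoustic scores alone.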
In another embodiment, if the mood data includes mood voice data, step S102 may include the following steps S21-S25.
S21: Convert the mood voice data into mood text data, and call the second mood model to perform semantic analysis on the mood text data to obtain a first mood.

After converting the mood voice data into mood text data, the terminal may call the second mood model to perform semantic analysis on the mood text data and obtain the first mood. The second mood model may likewise be a pre-trained emotion recognition model; see the description of the first mood model above, which is not repeated here.

In specific implementation, the terminal first performs semantic analysis on the mood text data through the second mood model to obtain one or more candidate mood words contained in the mood text data. Candidate mood words reflect the user's mood, for example angry, indignant, irritated, happy, pleased, and so on. Specifically, the terminal may process the mood text data according to a mood dictionary in the model, for example by applying syntax rules and splitting the text into words, to obtain at least one candidate mood word. The mood dictionary is custom-configured by the system; examples include a linguistic inquiry and word count (LIWC) dictionary, an EmoCD mood dictionary, and the like. The mood dictionary contains at least one preconfigured reference mood word. Optionally, each reference mood word is assigned a corresponding weight, which indicates the intensity of the mood reflected by that reference mood word (emotional intensity for short): the greater the emotional intensity reflected by a reference mood word, the greater its weight; conversely, the smaller the emotional intensity, the smaller its weight.
The terminal may then perform similarity matching between the candidate mood words and the reference mood words in the model, compute the similarity between each candidate mood word and each reference mood word, and determine the mood reflected by the target mood word as the first mood. The target mood word is the word among the candidate mood words that meets the following conditions: the similarity between the candidate mood word and a reference mood word is greater than or equal to a preset threshold (specifically, a third threshold), and the weight of that reference mood word is greater than or equal to a fourth threshold. The third and fourth thresholds may be custom-configured by the system, for example set according to user preference or actual demand, or derived from statistics over a series of experimental data. They may or may not be equal; the present invention does not limit this.

The specific implementation of similarity matching in the present invention is likewise not limited. For example, the terminal device may use any one, or a combination, of the following similarity matching algorithms (also called similarity calculation methods) to compute the similarity between words: term frequency (TF), term frequency-inverse document frequency (TF-IDF), word-to-vector conversion (word2vec), or other algorithms for computing lexical similarity.
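One of the listed options, frequency vectors compared by cosine similarity, can be sketched as follows. Comparing single mood words by their character counts is a deliberate simplification of what a TF-IDF or word2vec representation would do, and the reference lexicon with weights is a hypothetical stand-in for the mood dictionary:

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between character-frequency vectors of two words."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[ch] * vb[ch] for ch in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_mood(candidate, reference_lexicon, sim_threshold=0.5,
               weight_threshold=0.5):
    """Return the mood of the best reference word clearing both thresholds
    (the third and fourth thresholds of the description), or None."""
    best = None
    for ref_word, (mood, weight) in reference_lexicon.items():
        sim = cosine_similarity(candidate, ref_word)
        if sim >= sim_threshold and weight >= weight_threshold:
            if best is None or sim > best[1]:
                best = (mood, sim)
    return best[0] if best else None

# Hypothetical weighted reference lexicon: word -> (mood label, weight).
lexicon = {"happy": ("happy", 0.9), "gloomy": ("sad", 0.8)}
result = match_mood("happier", lexicon)
```

The two-threshold gate mirrors the claim language: a candidate only yields a first mood if it is both similar enough to a reference word and that word's emotional intensity (weight) is high enough.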
S22: Call the third mood model to perform acoustic analysis on the mood voice data to obtain a second mood.

The terminal may perform acoustic analysis on the mood voice data through the third mood model, obtaining the acoustic features contained in the mood voice data. According to the division into time domain and frequency domain, these comprise time-domain acoustic features and frequency-domain acoustic features. The third mood model can then analyze the time-domain and/or frequency-domain acoustic features contained in the mood voice data to obtain the second mood reflected by the data. The following describes the specific implementation of obtaining the second mood, taking the joint analysis of time-domain and frequency-domain acoustic features as an example.

Specifically, the terminal may perform feature extraction on the mood voice data in the time domain through the third mood model to obtain the time-domain acoustic features contained in the data. These are the temporal features of the mood voice data along the time axis and may include, but are not limited to, speech rate, duration, mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP), formants, or other time-domain characteristic parameters. Correspondingly, the terminal may also perform feature extraction on the mood voice data in the frequency domain through the third mood model to obtain the frequency-domain acoustic features, i.e., the features of the data along the frequency axis, which may include, but are not limited to, short-time energy, short-time average amplitude, zero-crossing rate, or other frequency-domain characteristic parameters.

The third mood model can then comprehensively analyze the time-domain and frequency-domain acoustic features to obtain the second mood. For example, the third mood model may determine the threshold interval within which each of the time-domain and frequency-domain acoustic features falls, and derive the mood corresponding to that interval. The third mood model may specifically be a pre-trained emotion recognition model; see the description of the first mood model above, which is not repeated here.
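Two of the characteristic parameters named above, short-time energy and zero-crossing rate, are simple to compute per frame. The frames below are synthetic stand-ins for real mood voice samples; intuitively, loud or rapidly alternating frames score higher on both measures:

```python
def short_time_energy(frame):
    """Sum of squared sample values within one frame."""
    return sum(s * s for s in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

# Synthetic frames: a loud alternating signal vs. a quiet constant one.
agitated = [1.0, -1.0, 1.0, -1.0, 1.0]
calm = [0.1, 0.1, 0.1, 0.1, 0.1]

energy_a, zcr_a = short_time_energy(agitated), zero_crossing_rate(agitated)
energy_c, zcr_c = short_time_energy(calm), zero_crossing_rate(calm)
```

A threshold-interval rule of the kind the description mentions could then map, say, high energy and high zero-crossing rate to an agitated mood, and low values to a calm one.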
S23. Calculate the similarity between the first mood and the second mood.
The terminal calculates the similarity between the first mood and the second mood using a preset similarity calculation method, so that the target emotion reflected by the mood voice data can subsequently be determined based on the similarity. For the similarity calculation method, refer to the foregoing description of similarity matching algorithms, which is not repeated here.
S24. When the similarity is greater than or equal to a first threshold, determine the first mood or the second mood as the target emotion.
S25. When the similarity is less than the first threshold, determine the first mood as the target emotion.
If the terminal determines that the similarity between the first mood and the second mood is greater than or equal to the first threshold, the first mood and the second mood are considered close; for example, the first mood is happiness and the second mood is pleasure. The terminal can then determine either the first mood or the second mood as the target emotion.
Conversely, if the similarity between the first mood and the second mood is less than the first threshold, the first mood and the second mood are considered to differ greatly or even conflict; for example, the first mood is pleasure while the second mood is agitation. To guarantee the accuracy of emotion recognition, the terminal can select the target emotion from the first mood and the second mood. For example, the terminal can choose either of the two as the target emotion. For another example, since the accuracy of text semantic analysis in emotion recognition is usually higher than that of acoustic analysis, the terminal can determine the first mood obtained by semantic analysis as the target emotion.
It should be noted that if the accuracy of emotion recognition is not a concern, the terminal can consider only text semantic analysis or only speech acoustic analysis, and take the correspondingly recognized mood as the target emotion reflected by the mood data, without comprehensively considering both text semantics (text features) and speech acoustics (acoustic features). This saves the computing resources of the terminal and helps improve computational efficiency.
The embodiments of the present invention place no limitation on the division granularity of moods (target emotions). For example, when the division granularity is coarse, the mood description is fuzzier; for instance, moods may be divided only by polarity into positive moods and negative moods. Conversely, when the division granularity is finer, the mood description is more precise. A coarse-grained mood may contain several fine-grained moods.
For example, Fig. 3 shows a schematic diagram of mood division. In Fig. 3, moods are divided at three granularities, comprising a first level, a second level, and a third level. The first level divides moods by polarity, and includes positive moods and negative moods. At the second level, each polarity is divided into several moods with intensity: as shown, the positive moods include enjoyment and positive expectation, and the negative moods include irritation and dislike. The third level further divides each mood of the second level into finer, discrete moods: as shown, enjoyment includes high spirits, pleasure, and amusement; positive expectation includes hopefulness and anticipation; irritation includes perturbation, discontent, and dejection; and dislike includes disdain and aversion.
In actual processing, since the granularity at which text semantic analysis obtains a mood and the granularity at which speech acoustic analysis obtains a mood may differ, the terminal can use the finer-grained emotion recognition mode to verify or refine the coarser-grained one. For example, if the granularity of the mood obtained by text semantic analysis is coarser than that obtained by speech acoustic analysis, i.e., the mood granularity of the second mood model is coarser than that of the third mood model, then after calling the second mood model to recognize, based on text semantics, the first mood reflected by the mood data, the terminal can further call the third mood model to recognize, based on acoustic analysis, the second mood reflected by the mood data, so as to verify or refine the first mood. This makes it easier to subsequently obtain the target emotion reflected by the mood data more precisely, improving the accuracy of emotion recognition.
As described above, if the similarity between the first mood and the second mood is greater than or equal to the first threshold, the terminal can consider that the first mood and the second mood belong to the same emotion type. Since the granularity of the second mood is finer than that of the first mood, the mood described by the second mood is more precise, and the terminal can determine the second mood as the target emotion reflected by the mood data. Conversely, if the similarity between the first mood and the second mood is less than the first threshold, the terminal considers that the first mood and the second mood do not belong to the same emotion type; in that case it can send a prompt message asking the user whether to determine the first mood as the target emotion reflected by the mood data. This helps improve the accuracy of emotion recognition while giving the user a sense of participation, which helps improve user experience.
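The three-level division of Fig. 3, and the check of whether a fine-grained result refines a coarse-grained one, can be sketched as a small tree lookup. The labels follow the description in the text; their exact wording is illustrative.

```python
# Three-level mood hierarchy in the spirit of Fig. 3.
MOOD_TREE = {
    "positive": {
        "enjoyment": {"high spirits", "pleasure", "amusement"},
        "positive expectation": {"hope", "anticipation"},
    },
    "negative": {
        "irritation": {"perturbed", "discontent", "dejected"},
        "dislike": {"disdain", "aversion"},
    },
}

def refines(coarse_mood, fine_mood):
    """True if fine_mood is a finer-grained member of coarse_mood,
    i.e. the two recognition results belong to the same emotion type."""
    if coarse_mood in MOOD_TREE:  # level-1 coarse mood
        return any(
            fine_mood == mid or fine_mood in leaves
            for mid, leaves in MOOD_TREE[coarse_mood].items()
        )
    for mids in MOOD_TREE.values():  # level-2 coarse mood
        if coarse_mood in mids and fine_mood in mids[coarse_mood]:
            return True
    return False
```

When `refines` returns true, the finer mood can safely replace the coarser one as the target emotion; when it returns false, the results conflict and the prompt-the-user path applies.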
In another embodiment, if the mood data includes mood text data, the terminal can perform semantic analysis on the mood text data to obtain the target emotion reflected by the mood text data; for details, refer to the specific implementations of the foregoing steps S11 or S21, which are not repeated here.
In another embodiment, if the mood data includes mood image data, step S102 includes the following steps S31-S35.
S31. Extract the target facial expression from the mood image data, and obtain the third mood reflected by the target facial expression.
The terminal can use a face recognition algorithm to perform face recognition on the mood image data, obtaining the target facial expression contained in the mood image data and the third mood reflected by that expression. The face recognition algorithm can be custom-configured by the system, and may include but is not limited to facial emotion recognition algorithms based on geometric features, local feature analysis algorithms, eigenface algorithms, neural network algorithms, and the like.
Taking the facial emotion recognition algorithm based on geometric features as an example, the terminal can perform face recognition on the mood image data using geometric features, typically extracting important facial organs such as the eyes, mouth, nose, and dimples as classification features, to obtain the face image contained in the mood image data. Expression recognition can then be performed on the face image to obtain the target facial expression it contains, and in turn the third mood reflected by the target facial expression. For example, if the target facial expression is a smile, the third mood reflected by it is happiness.
S32. Extract the target limb behavior from the mood image data, and obtain the fourth mood reflected by the target limb behavior.
The terminal can use a behavior recognition algorithm to perform behavior recognition on the mood image data, obtaining the target limb behavior contained in the mood image data and the fourth mood reflected by that behavior. The behavior recognition algorithm can be pre-trained, and may include but is not limited to human behavior algorithms based on deep learning, human behavior algorithms based on convolutional neural networks, and the like.
Optionally, the output of the behavior recognition algorithm may be either the target limb behavior contained in the mood image data or the fourth mood reflected by that behavior. When the output of the behavior recognition algorithm is the target limb behavior contained in the mood image data, since different limb behaviors can correspond to different moods, the terminal also needs to obtain the corresponding fourth mood reflected by the target limb behavior from a limb-mood mapping table. The limb-mood mapping table includes mapping relationships between one or more groups of limb behaviors and moods: each limb behavior corresponds to one mood, and one mood can correspond to one or more limb behaviors. For example, Table 1 below shows a schematic limb-mood mapping table.
Table 1

Serial number | Limb behavior | User emotion
1 | Limb behavior 1 | Happiness
2 | Limb behavior 2 | Anger
... | ... | ...
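The table lookup described above amounts to a keyed mapping with a fallback. A minimal sketch follows; the behavior labels are hypothetical stand-ins for recognizer outputs, not values from the patent.

```python
# Placeholder limb-mood mapping table in the spirit of Table 1.
LIMB_MOOD_TABLE = {
    "arms raised": "happiness",
    "jumping": "happiness",      # one mood may map from several behaviors
    "fist clenched": "anger",
}

def fourth_mood_for(behavior, default="neutral"):
    """Look up the mood reflected by a recognized limb behavior,
    falling back to a default when the behavior is not in the table."""
    return LIMB_MOOD_TABLE.get(behavior, default)
```

The default value models the case of a behavior the table does not cover; a deployed system might instead skip the fourth mood entirely in that case.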
S33. Calculate the similarity between the third mood and the fourth mood.
S34. When the similarity between the third mood and the fourth mood is greater than or equal to a second threshold, determine the third mood or the fourth mood as the target emotion.
S35. When the similarity between the third mood and the fourth mood is less than the second threshold, determine the third mood as the target emotion.
The terminal calculates the similarity between the third mood and the fourth mood using a similarity calculation method. When the similarity between the third mood and the fourth mood is greater than or equal to the second threshold, the third mood and the fourth mood are relatively similar, and the terminal can take either the third mood or the fourth mood as the target emotion reflected by the mood data.
Conversely, when the similarity between the third mood and the fourth mood is less than the second threshold, the third mood and the fourth mood differ greatly, and the terminal can obtain the target emotion from the third mood and the fourth mood according to a preset decision rule. The preset decision rule is custom-configured by the system, for example directly determining the third mood reflected by the facial expression as the target emotion.
Optionally, if the accuracy of emotion recognition is not a concern, the terminal can determine either the third mood reflected by the target facial expression or the fourth mood reflected by the target limb behavior as the target emotion, without comprehensively analyzing both facial expression and limb behavior. This saves the computing resources of the terminal and improves processing efficiency.
Optionally, in actual processing, since the granularity at which facial expressions reflect moods and the granularity at which limb behaviors reflect moods may differ, the terminal can use the finer-grained emotion recognition mode to verify or refine the coarser-grained one; refer to the related introduction in the foregoing embodiments, which is not repeated here.
It should be noted that the specific implementations involved in the embodiments of the present invention can be used alone or in combination. For example, if the mood data includes both mood voice data and mood image data, the terminal can combine the respective emotion recognition modes of mood voice data and mood image data and comprehensively analyze them to obtain the target emotion reflected by the mood data; refer to the foregoing specific implementations for obtaining the target emotion from mood voice data and mood image data, which are not repeated here.
S103. Match a corresponding target filter mode for the image to be processed according to the target emotion.
After obtaining the target emotion, the terminal can obtain a mood-filter mapping table and further obtain from it the target filter mode corresponding to the target emotion. The mood-filter mapping table can be pre-configured in a local database of the terminal, or configured in a remote server. The mood-filter mapping table includes mapping relationships between one or more groups of moods and filter modes: each mood corresponds to one filter mode, and one filter mode can correspond to one or more moods. Illustratively, Table 2 below shows a schematic mood-filter mapping table.
Table 2
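Since the contents of Table 2 are not reproduced here, the following sketch uses hypothetical mood and filter-mode names to illustrate the lookup of step S103; none of the names are taken from the patent.

```python
# Hypothetical mood-filter mapping table in the spirit of Table 2.
MOOD_FILTER_TABLE = {
    "happiness": "warm_glow",
    "pleasure": "warm_glow",     # one filter mode can serve several moods
    "sadness": "cool_fade",
    "anger": "high_contrast",
}

def target_filter_for(target_mood, default="none"):
    """S103: match the filter mode configured for the target emotion,
    with a default when no mapping is configured."""
    return MOOD_FILTER_TABLE.get(target_mood, default)
```

In the remote-server configuration mentioned above, `MOOD_FILTER_TABLE` would be fetched over the network rather than held in local memory.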
S104. Perform filter processing on the image to be processed using the target filter mode, to obtain the target image.
In the embodiment of the present invention, since the image to be processed is usually an encoded image, such as an image in a format like JPG or PNG, the terminal needs to decode the image to be processed to obtain a decoded image. Then, the terminal performs filtering and rendering on the decoded image through a central processing unit (CPU) using the target filter mode, to obtain the target image. It can be seen that the terminal applies a mood filter to the image to be processed, which helps improve the image enhancement effect and avoids problems such as words failing to convey meaning or failing to accurately express the user's true intention.
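The decode-filter-render pipeline above can be sketched at the pixel level. This toy "warm" filter operates on an already-decoded image, represented here simply as a list of RGB tuples; in practice the decoding step would be handled by an image codec library, and the filter name and channel adjustments are assumptions.

```python
def apply_warm_filter(pixels, boost=30):
    """Apply a toy 'warm' filter to decoded RGB pixels: raise the red
    channel and lower the blue channel, clamping to the 0-255 range."""
    out = []
    for r, g, b in pixels:
        out.append((min(r + boost, 255), g, max(b - boost, 0)))
    return out

# Stand-in for a decoded image: a flat list of RGB tuples.
decoded = [(200, 120, 80), (250, 200, 240)]
rendered = apply_warm_filter(decoded)
```

The clamping mirrors what a real filter kernel must do: channel values saturate at the ends of the 8-bit range rather than wrap.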
Refer to Fig. 4, which is a schematic flowchart of another image processing method provided by an embodiment of the present invention. The method shown in Fig. 4 includes steps S401-S405.
S401. Collect mood data in response to a dynamic publishing instruction in a social application.
S402. If the mood data includes mood image data, determine the mood image data as the image to be processed.
If the terminal detects a dynamic publishing instruction for the social application, it can respond to the instruction and collect mood data. The mood data can be the mood data of a designated user, or the mood data of a designated user within a designated time period. As described above, the mood data may specifically be at least one of the following: mood voice data, mood image data, and mood text data. The designated time period can be set by the user or by system default, for example 60 seconds (s). The designated user can be any user; illustratively, the terminal can make an audio recording of the designated user to obtain mood voice data, or track and photograph the designated user to obtain mood image data.
If the mood data includes mood image data, the terminal can directly use the mood image data as the image to be processed, which spares the user from inputting an image to be processed again, reduces user operations, and helps improve the efficiency of image processing.
S403. Identify the target emotion reflected by the mood data.
S404. Match a corresponding target filter mode for the image to be processed according to the target emotion.
S405. Perform filter processing on the image to be processed using the target filter mode, to obtain the target image.
Optionally, the terminal can also publish the target image in the social application for users to access.
Refer to Fig. 5, which is a schematic flowchart of another image processing method provided by an embodiment of the present invention. The method shown in Fig. 5 includes steps S501-S505.
S501. Collect mood data in response to a dynamic publishing instruction in a social application.
If the terminal detects a dynamic publishing instruction for the social application, it can respond to the instruction and collect mood data. For the dynamic publishing instruction and the mood data, refer to the description above, which is not repeated here.
S502. Obtain the image to be processed according to the dynamic publishing instruction.
If the dynamic publishing instruction of the embodiment of the present invention carries the image to be processed, the terminal can obtain the image to be processed directly by parsing the dynamic publishing instruction. Alternatively, if the dynamic publishing instruction does not carry the image to be processed but is used to indicate that an image to be processed should be obtained, the terminal can obtain the image to be processed according to the indication of the dynamic publishing instruction; the image to be processed can be input by the user, or sent by another device (such as a server).
Optionally, after collecting the mood data, the terminal can send a prompt message asking the user whether to input an image to be processed. The embodiment of the prompt message is not limited; for example, the user can be prompted to choose whether to input an image to be processed by means of a pop-up (floating window), short message, subtitle, picture, or the like.
S503. Identify the target emotion reflected by the mood data.
S504. Match a corresponding target filter mode for the image to be processed according to the target emotion.
S505. Perform filter processing on the image to be processed using the target filter mode, to obtain the target image.
For example, take the social application being the echo application as an example. Fig. 6(a)-Fig. 6(e) show a schematic scene of collecting mood data and an image to be processed. As in Fig. 6(a), the user launches the echo application and enters its user interface, as shown in Fig. 6(b). The user selects publishing a dynamic in the interface and enters the audio recording interface shown in Fig. 6(c). The user long-presses the audio recording key to record the mood data of the designated time period — here, mood voice data. As shown in Fig. 6(d), 38 seconds of mood data were recorded. The user then clicks the next button (illustrated as an icon used to indicate the next operation, specifically a circular icon containing a greater-than symbol) to enter the interface shown in Fig. 6(e), where the user actively chooses the image to be processed needed for the dynamic to be published. Optionally, a new prompt interface (not shown) can also be displayed between Fig. 6(d) and Fig. 6(e) to ask the user whether to select an image to be processed; if the terminal detects that the user needs to select an image to be processed, it jumps to Fig. 6(e) for the user to select and input the image to be processed.
It should be noted that, for content not described in Fig. 4 and Fig. 5, refer to the description in the foregoing method embodiment of Fig. 1, which is not repeated here.
Optionally, after obtaining the target image, the terminal can also perform steps S701-S705 in Fig. 7.
S701. If the mood data includes mood voice data or mood text data, synthesize the mood voice data or mood text data into the target image, to obtain a composite image.
In one embodiment, if the mood data includes mood voice data, the terminal can convert the mood voice data into corresponding mood text data, and further add the mood text data into the target image in the manner of an image subtitle, to obtain the composite image. The specific location at which the mood text data is added to the target image is not limited; for example, it can be added to the upper-left corner or upper-right corner of the target image, or placed in the center. For the conversion of mood voice data into mood text data, refer to the related introduction in the foregoing embodiments, which is not repeated here.
In another embodiment, if the mood data includes mood voice data, the terminal can embed the mood voice data into the target image to obtain the composite image.
In another embodiment, if the mood data directly includes mood text data, the terminal can add the mood text data into the target image in the manner of an image subtitle, to obtain the composite image.
In another embodiment, if the mood data includes mood text data, the terminal can convert the mood text data into corresponding mood voice data and embed the mood voice data into the target image, to obtain the composite image. The manner of converting mood text data into mood voice data is not limited; for example, the terminal can play the mood text data in a pre-configured voice style (such as a child's voice or a soprano voice) to form the corresponding mood voice data.
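The subtitle-placement choices mentioned above (upper-left corner, upper-right corner, centered) reduce to a small coordinate computation once the image and rendered-text dimensions are known. A hedged sketch, with the anchor names and margin as assumptions:

```python
def caption_position(img_w, img_h, text_w, text_h, anchor="top-left", margin=10):
    """Compute the (x, y) at which subtitle text would be drawn on the
    target image, for the placements mentioned in the text."""
    if anchor == "top-left":
        return (margin, margin)
    if anchor == "top-right":
        return (img_w - text_w - margin, margin)
    if anchor == "center":
        return ((img_w - text_w) // 2, (img_h - text_h) // 2)
    raise ValueError("unknown anchor: " + anchor)
```

An image library's text-drawing call would then take this position along with the mood text data to produce the composite image.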
It should be noted that the above implementations for obtaining the composite image can be used alone or in combination. For example, the terminal can both embed the mood voice data into the target image and add the mood text data corresponding to the mood voice data into the target image, to obtain a composite image containing both voice and text data.
S702. Publish the composite image in the social application.
The terminal of the embodiment of the present invention can further publish the composite image in the social application in response to an image publishing instruction. The image publishing instruction can be the same instruction as the dynamic publishing instruction above or a different one; the present invention does not limit this. When they are different instructions, the image publishing instruction refers to an instruction generated when the terminal detects an image publishing operation, and the dynamic publishing instruction refers to an instruction generated when the terminal detects a data collection operation, for collecting the mood data and/or the image to be processed. The image publishing operation is an operation custom-configured by the system, such as clicking a publish button; correspondingly, the data collection operation can also be an operation custom-configured by the system, such as clicking a voice recording button.
For example, with reference to the mood data and image to be processed collected in the foregoing examples of Fig. 6(a)-Fig. 6(e), assume the target emotion identified by the terminal from the mood data is happiness. Fig. 6(f)-Fig. 6(h) show a schematic diagram of the concrete scene of publishing the composite image in the social application. Specifically, after identifying that the target emotion reflected by the mood data is happiness, the terminal can perform filter processing on the image to be processed according to the target filter mode corresponding to happiness; as shown in Fig. 6(f), a smiling happy expression can be rendered on the image to be processed to obtain the target image. The terminal can also embed the recorded mood data (mood voice data) into the target image to obtain the composite image, as shown in Fig. 6(g). The user can then click the publish button in the echo application to publish the composite image in the echo application, as shown in Fig. 6(h).
S703. In response to a first viewing operation for the composite image, display the target image in the composite image.
If the terminal detects a first viewing operation for the composite image, it responds to the first viewing operation and displays the target image contained in the composite image on the display screen. The first viewing operation is custom-configured by the system, for example according to product requirements or user preferences. For instance, when the user browses the composite image in the social application and the terminal detects a browsing operation for the composite image, the terminal can display the target image contained in the composite image, play the mood voice data, or display the mood text data contained in the composite image.
S704. In response to a second viewing operation for the composite image, display the target image in the composite image and play target voice data, where the target voice data can be the mood voice data contained in the mood data, or voice data obtained by converting the mood text data contained in the mood data.
Regardless of whether the mood data contains mood voice data and/or mood text data, if the terminal detects a second viewing operation for the composite image, it responds to the second viewing operation, displays the target image in the composite image on the display screen, and plays the target voice data. If the mood data includes mood voice data, the target voice data can directly be the mood voice data. If the mood data includes mood text data, the target voice data can be the voice data correspondingly converted from the mood text data. If the mood data includes both mood text data and mood voice data, then to accurately convey the user's true intention, the target voice data can be the mood voice data; by system default, it can also be the voice data correspondingly converted from the mood text data.
The second viewing operation can likewise be custom-configured by the system, and is different from the first viewing operation. For example, if the terminal detects a double-click operation for the composite image in the social application, it can display the target image in the composite image in full screen and play the mood voice data in the composite image.
Optionally, after responding to the second viewing operation, the terminal can also simultaneously display target text data, which can be the mood text data contained in the mood data, or the text data correspondingly converted from the mood voice data contained in the mood data. This helps ensure the user's viewing experience and improves the utilization rate of the social application.
S705. In response to a third viewing operation for the composite image, display the target image and target text data in the composite image, where the target text data can be the mood text data contained in the mood data, or the text data correspondingly converted from the mood voice data contained in the mood data.
If the terminal detects a third viewing operation for the composite image, it responds to the third viewing operation and displays the target image and the target text data in the composite image on the display screen. If the mood data contains mood text data, the target text data can directly be the mood text data. If the mood data contains mood voice data, the target text data can be the text data correspondingly converted from the mood voice data. If the mood data contains both mood voice data and mood text data, then to save terminal resources, the target text data can be the mood text data; optionally, it can also be the text data correspondingly converted from the mood voice data, which the present invention does not limit.
The third viewing operation can likewise be custom-configured by the system, and is different from both the first viewing operation and the second viewing operation. For example, when the terminal detects a single-click operation for the composite image in the social application, it can display the target image in the composite image and simultaneously display the mood text data.
In practical applications, the terminal can perform any one or more of steps S703-S705. When the terminal can perform multiple of these steps, their execution order is not limited; for example, the terminal can perform step S705 first and step S703 afterwards.
The terminal of the embodiment of the present invention may include a smartphone (such as an Android phone or an iOS phone), a personal computer, a tablet computer, a palmtop computer, a mobile internet device (MID), a wearable smart device, or another internet device; the embodiment of the present invention is not limited thereto.
Through the implementation of the embodiment of the present invention, content can be presented through more senses; for example, an image with sound or text can be presented by combining sound and vision, allowing the user to present published content in the social application more accurately and richly, which helps promote the interest, interactivity, and utilization rate of the social application. Moreover, enhancing published content (images) based on emotion recognition also solves problems in traditional technology such as poor image enhancement effects and failure to express the user's true intention.
Refer to Fig. 8, which is a schematic flowchart of an image processing method based on a scene application provided by an embodiment of the present invention. The method shown in Fig. 8 includes steps S801-S804.
S801. Obtain mood data and an image to be processed in response to a dynamic publishing instruction in a social application.
If the terminal of the embodiment of the present invention detects a dynamic publishing instruction in the social application, it may respond to the dynamic publishing instruction and obtain the mood data and the image to be processed. Specifically, the terminal can obtain the mood data and the image to be processed, and process the image to be processed based on the mood data to obtain the target image; for the acquisition of the target image, refer to the descriptions in any of the foregoing method embodiments of Fig. 1, Fig. 4, and Fig. 5, which are not repeated here.
The dynamic publishing instruction can be an instruction generated when the terminal detects the user performing a dynamic publishing operation in the social application; the dynamic publishing operation can be a click operation or a slide operation on a designated dynamic publishing key in the social application. The social application refers to software for achieving user communication through a network, and may include but is not limited to blog applications, microblog applications, forum applications, social network applications (such as Facebook), and instant messaging applications (such as WeChat and QQ).
S802. Identify the target emotion reflected by the mood data, and match the target filter mode corresponding to the target emotion for the image to be processed.
S803. Perform filter processing on the image to be processed using the target filter mode, to obtain the target image.
S804. Publish the target image in the social application.
Optionally, considering the interest and completeness of dynamic (image) publishing and to enhance user social interaction, the terminal can consider synthesizing the mood data with the target image to obtain a composite image, and publish the composite image in the social application. Specifically:
In one embodiment, if the mood data includes only mood image data, the terminal responds to the dynamic publishing instruction and can publish the target image in the social application.
In another embodiment, if the mood data includes mood voice data or mood text data, the terminal can synthesize the mood voice data or mood text data into the target image to obtain a composite image, and then publish the composite image in the social application to complete the corresponding dynamic publishing. For the elaboration on the composite image, refer to the related description in the foregoing method embodiment of Fig. 7, which is not repeated here. For a scene application example of publishing a dynamic in the social application, refer to the foregoing related introduction of Fig. 6(a)-Fig. 6(h) of the embodiment of the present invention, in which the user operates in the social application in sequence to complete the publishing of the composite image; details are not repeated here.
Through the implementation of the embodiment of the present invention, content can be presented through more senses; for example, an image with sound or text can be presented by combining sound and vision, allowing the user to present published content in the social application more accurately and richly, which helps promote the interest, interactivity, and utilization rate of the social application. Moreover, enhancing published content (images) based on emotion recognition also solves problems in traditional technology such as poor image enhancement effects and failure to express the user's true intention.
Based on the description of the above image processing method embodiments, an embodiment of the present invention further discloses an image processing apparatus. The apparatus may be a computer program (including program code) running in a terminal, and may execute the content described in any one of the embodiments of Fig. 1 to Fig. 8 above. Referring to Fig. 9, the apparatus may run the following units:
Acquiring unit 801, configured to obtain mood data and an image to be processed, the mood data including mood voice data, mood image data or mood text data;
Recognition unit 802, configured to identify the target emotion reflected by the mood data;
Matching unit 803, configured to match, according to the target emotion, a corresponding target filter mode for the image to be processed;
Processing unit 804, configured to perform filter processing on the image to be processed using the target filter mode to obtain a target image.
In one embodiment, the acquiring unit 801 is specifically configured to: in response to a dynamic release instruction in a social application, collect mood data; and if the mood data includes mood image data, determine the mood image data as the image to be processed.
In another embodiment, the acquiring unit 801 is specifically configured to: in response to the dynamic release instruction in the social application, collect mood data; and obtain the image to be processed according to the dynamic release instruction.
In another embodiment, the processing unit 804 is further configured to: if the mood data includes mood voice data or mood text data, synthesize the mood voice data or mood text data into the target image to obtain a composite image; and publish the composite image in the social application.
In another embodiment, the processing unit 804 is further configured to: in response to a first viewing operation for the composite image, display the target image in the composite image; or, in response to a second viewing operation for the composite image, display the target image in the composite image and play target speech data, the target speech data being the mood voice data or voice data corresponding to the mood text data; or, in response to a third viewing operation for the composite image, display the target image and target text data in the composite image, the target text data being the mood text data or text data corresponding to the mood voice data.
In another embodiment, the matching unit 803 is specifically configured to: obtain a mood filter mapping table, the mood filter mapping table recording mapping relations between moods and filter modes, each mapping relation associating one filter mode with at least one mood; and obtain the target filter mode corresponding to the target emotion from the mood filter mapping table.
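A mood filter mapping table of this kind can be sketched as a simple lookup in which one filter mode corresponds to at least one mood. The table contents, mood labels and filter-mode names below are invented for illustration; the patent does not enumerate them.

```python
from typing import Optional

# Invented example table: each filter mode maps to at least one mood.
MOOD_FILTER_TABLE = {
    "warm":      ["happy", "excited"],
    "cool_blue": ["sad", "calm"],
    "high_key":  ["surprised"],
}

def match_filter(target_emotion: str) -> Optional[str]:
    """Return the filter mode whose mood list contains the target emotion."""
    for filter_mode, moods in MOOD_FILTER_TABLE.items():
        if target_emotion in moods:
            return filter_mode
    return None
```

A real terminal would likely invert this table once at startup so the per-image lookup is O(1).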
In another embodiment, if the mood data includes mood voice data, the recognition unit 802 is specifically configured to: convert the mood voice data into corresponding mood text data, and extract text features from the corresponding mood text data; extract acoustic features from the mood voice data; and call the first mood model to perform fusion recognition on the text features and the acoustic features to obtain the target emotion.
In another embodiment, if the mood data includes mood voice data, the recognition unit 802 is specifically configured to: convert the mood voice data into corresponding mood text data, and call the second mood model to perform semantic analysis on the corresponding mood text data to obtain a first mood; call the third mood model to perform acoustic analysis on the mood voice data to obtain a second mood; when the similarity between the first mood and the second mood is greater than or equal to a first threshold, determine the first mood or the second mood as the target emotion; and when the similarity between the first mood and the second mood is less than the first threshold, determine the first mood as the target emotion.
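The threshold rule above (compare the semantic and acoustic estimates; prefer the semantic one when they diverge) can be sketched as follows. The similarity table and the default threshold value are illustrative stand-ins: the patent does not specify how similarity between two mood labels is computed.

```python
# Toy similarity between mood labels (an invented stand-in for whatever
# measure the first-threshold comparison actually uses).
MOOD_SIMILARITY = {("happy", "excited"): 0.8, ("happy", "sad"): 0.1}

def mood_similarity(a: str, b: str) -> float:
    if a == b:
        return 1.0
    return MOOD_SIMILARITY.get((a, b), MOOD_SIMILARITY.get((b, a), 0.0))

def decide_target_emotion(first_mood: str, second_mood: str,
                          first_threshold: float = 0.5) -> str:
    """first_mood: semantic-analysis result; second_mood: acoustic result."""
    if mood_similarity(first_mood, second_mood) >= first_threshold:
        # Estimates agree: the text allows either; this sketch keeps the first.
        return first_mood
    # Estimates diverge: the semantic (first) mood is used.
    return first_mood
```

Note that both branches return the first mood here; the branch structure is kept only to mirror the two cases the text distinguishes.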
In another embodiment, if the mood data includes mood image data, the recognition unit 802 is specifically configured to: extract a target facial expression from the mood image data, and obtain a third mood reflected by the target facial expression; extract a target limb behavior from the mood image data, and obtain a fourth mood reflected by the target limb behavior; when the similarity between the third mood and the fourth mood is greater than or equal to a second threshold, determine the third mood or the fourth mood as the target emotion; and when the similarity between the third mood and the fourth mood is less than the second threshold, determine the third mood as the target emotion.
In another embodiment, the recognition unit 802 is specifically configured to: perform semantic analysis on the mood text data to obtain at least one candidate mood vocabulary item; perform similarity matching between the candidate mood vocabulary and the reference mood vocabulary included in the first mood model, to obtain similarities between the candidate mood vocabulary and the reference mood vocabulary; and determine the mood reflected by a target mood vocabulary item as the first mood; wherein the target mood vocabulary item is, among the at least one candidate mood vocabulary item, the item whose similarity is greater than or equal to a third threshold and whose corresponding reference mood vocabulary item has a weight greater than or equal to a fourth threshold, the weight of a reference mood vocabulary item being used to indicate the intensity of the mood it reflects.
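The two-threshold vocabulary selection above can be sketched as follows. The reference vocabulary, the weights, the character-overlap "similarity" and both threshold values are invented placeholders; only the selection rule (similarity ≥ third threshold AND weight ≥ fourth threshold) follows the text.

```python
# Invented reference vocabulary: word -> (mood it reflects, weight =
# intensity of that mood). A real model would learn this.
REFERENCE_VOCAB = {
    "ecstatic": ("happy", 0.9),
    "pleased":  ("happy", 0.4),
    "gloomy":   ("sad", 0.8),
}

def char_overlap(a: str, b: str) -> float:
    """Toy similarity: shared-character ratio (stand-in for a real measure)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

def first_mood(candidates, third_threshold=0.5, fourth_threshold=0.6):
    """Return the mood of the first candidate passing both thresholds."""
    for cand in candidates:
        for ref, (mood, weight) in REFERENCE_VOCAB.items():
            if char_overlap(cand, ref) >= third_threshold and weight >= fourth_threshold:
                return mood
    return None
```

With these toy values, "pleased" matches its reference word perfectly but fails the weight threshold, so it yields no first mood.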
In another embodiment, the recognition unit 802 is specifically configured to: perform feature extraction on the mood voice data in the time domain to obtain time-domain acoustic features; perform feature extraction on the mood voice data in the frequency domain to obtain frequency-domain acoustic features; and analyze the time-domain acoustic features and the frequency-domain acoustic features to obtain the second mood.
According to another embodiment of the present invention, each of the units in the image processing apparatus shown in Fig. 9 may, separately or together, be merged into one or several other units, or one (or some) of the units may be further split into multiple functionally smaller units; this achieves the same operations without affecting the realization of the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present invention, the image processing apparatus may also include other units, and in practical applications these functions may also be realized with the assistance of other units or through the cooperation of multiple units.
According to another embodiment of the present invention, a computer program (including program code) capable of executing the steps involved in any of the embodiments of Fig. 1 to Fig. 8 may be run on a general-purpose computing device, such as a computer, that includes processing elements such as a central processing unit (CPU), a random access storage medium (RAM) and a read-only storage medium (ROM) as well as storage elements, so as to construct the image processing apparatus shown in Fig. 9 and realize the image processing method of the embodiments of the present invention. The computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above computing device via the computer-readable recording medium, and run therein.
The embodiment of the present invention can obtain mood data and an image to be processed, identify the target emotion reflected by the mood data, match a corresponding target filter mode for the image to be processed according to the target emotion, and finally perform filter processing on the image to be processed using the target filter mode to obtain a target image. Performing filter processing on an image based on mood in this way can solve problems in the traditional technology such as poor image enhancement effects, inability to accurately express the user's true intention, and reduced enthusiasm for interaction.
Based on the descriptions of the above method embodiments and apparatus embodiments, an embodiment of the present invention further provides a terminal. Referring to Fig. 10, the terminal includes at least a processor 901, an input device 902, an output device 903 and a computer storage medium 904, which may be connected by a bus or in other ways.
The computer storage medium 904 may be stored in the memory of the terminal and is used to store a computer program, the computer program including program instructions; the processor 901 is used to execute the program instructions stored by the computer storage medium 904. The processor 901 (or CPU, Central Processing Unit) is the computing core and control core of the terminal, adapted to implement one or more instructions, and specifically adapted to load and execute one or more instructions to realize the corresponding method procedures or functions. In one embodiment, the processor 901 described in the embodiment of the present invention may be used to perform a series of image processing, including: obtaining mood data and an image to be processed; identifying the target emotion reflected by the mood data; matching a corresponding target filter mode for the image to be processed according to the target emotion; performing filter processing on the image to be processed using the target filter mode to obtain a target image; and so on.
An embodiment of the present invention further provides a computer storage medium (memory), which is a storage device in the terminal used to store programs and data. It can be understood that the computer storage medium here may include both a built-in storage medium of the terminal and, of course, an extended storage medium supported by the terminal. The computer storage medium provides storage space, which stores the operating system of the terminal. Also housed in the storage space are one or more instructions suitable for being loaded and executed by the processor 901; these instructions may be one or more computer programs (including program code). It should be noted that the computer storage medium here may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory; optionally it may also be at least one computer storage medium located remotely from the aforementioned processor.
In one embodiment, one or more instructions stored in the computer storage medium may be loaded and executed by the processor 901 to realize the corresponding steps of the methods in the above image processing embodiments. In a specific implementation, one or more instructions in the computer storage medium are loaded by the processor 901 to execute the following steps:
obtaining mood data and an image to be processed, the mood data including mood voice data, mood image data or mood text data;
identifying the target emotion reflected by the mood data;
matching, according to the target emotion, a corresponding target filter mode for the image to be processed;
performing filter processing on the image to be processed using the target filter mode to obtain a target image.
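The four steps just listed can be strung together as a toy end-to-end pipeline. Every component here (the recognizer, the emotion-to-filter table, the "filter" itself) is a placeholder stand-in; a real implementation would use the mood models and filter modes described in the earlier embodiments.

```python
def identify_target_emotion(mood_data: dict) -> str:
    # Placeholder recognizer: trusts an explicit label in the mood data.
    return mood_data.get("label", "neutral")

# Invented emotion -> filter-mode table.
FILTER_TABLE = {"happy": "warm", "sad": "cool_blue", "neutral": "none"}

def apply_filter(image: list, filter_mode: str) -> list:
    # Toy "filter": brighten 8-bit pixel values for the warm mode.
    if filter_mode == "warm":
        return [min(255, p + 20) for p in image]
    return list(image)

def process(mood_data: dict, image: list) -> list:
    emotion = identify_target_emotion(mood_data)     # identify target emotion
    filter_mode = FILTER_TABLE.get(emotion, "none")  # match target filter mode
    return apply_filter(image, filter_mode)          # perform filter processing

target = process({"label": "happy"}, [100, 250, 0])
```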
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: in response to a dynamic release instruction in a social application, collecting mood data; and if the mood data includes mood image data, determining the mood image data as the image to be processed.
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: in response to the dynamic release instruction in the social application, collecting mood data; and obtaining the image to be processed according to the dynamic release instruction.
In a further embodiment, one or more instructions may also be loaded and executed by the processor 901: if the mood data includes mood voice data or mood text data, synthesizing the mood voice data or mood text data into the target image to obtain a composite image; and publishing the composite image in the social application.
In a further embodiment, one or more instructions may also be loaded and executed by the processor 901: in response to a first viewing operation for the composite image, displaying the target image in the composite image; or, in response to a second viewing operation for the composite image, displaying the target image in the composite image and playing target speech data, the target speech data being the mood voice data or voice data corresponding to the mood text data; or, in response to a third viewing operation for the composite image, displaying the target image and target text data in the composite image, the target text data being the mood text data or text data corresponding to the mood voice data.
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: obtaining a mood filter mapping table, the mood filter mapping table recording mapping relations between moods and filter modes, each mapping relation associating one filter mode with at least one mood; and obtaining the target filter mode corresponding to the target emotion from the mood filter mapping table.
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: converting the mood voice data into corresponding mood text data, and extracting text features from the corresponding mood text data; extracting acoustic features from the mood voice data; and calling the first mood model to perform fusion recognition on the text features and the acoustic features to obtain the target emotion.
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: converting the mood voice data into corresponding mood text data, and calling the second mood model to perform semantic analysis on the corresponding mood text data to obtain the first mood; calling the third mood model to perform acoustic analysis on the mood voice data to obtain the second mood; when the similarity between the first mood and the second mood is greater than or equal to the first threshold, determining the first mood or the second mood as the target emotion; and when the similarity between the first mood and the second mood is less than the first threshold, determining the first mood as the target emotion.
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: extracting the target facial expression from the mood image data, and obtaining the third mood reflected by the target facial expression; extracting the target limb behavior from the mood image data, and obtaining the fourth mood reflected by the target limb behavior; when the similarity between the third mood and the fourth mood is greater than or equal to the second threshold, determining the third mood or the fourth mood as the target emotion; and when the similarity between the third mood and the fourth mood is less than the second threshold, determining the third mood as the target emotion.
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: performing semantic analysis on the mood text data to obtain at least one candidate mood vocabulary item; performing similarity matching between the candidate mood vocabulary and the reference mood vocabulary included in the first mood model, to obtain similarities between the candidate mood vocabulary and the reference mood vocabulary; and determining the mood reflected by the target mood vocabulary item as the first mood; wherein the target mood vocabulary item is, among the at least one candidate mood vocabulary item, the item whose similarity is greater than or equal to the third threshold and whose corresponding reference mood vocabulary item has a weight greater than or equal to the fourth threshold, the weight of a reference mood vocabulary item being used to indicate the intensity of the mood it reflects.
In a further embodiment, one or more instructions are loaded by the processor 901 to specifically execute: performing feature extraction on the mood voice data in the time domain to obtain time-domain acoustic features; performing feature extraction on the mood voice data in the frequency domain to obtain frequency-domain acoustic features; and analyzing the time-domain acoustic features and the frequency-domain acoustic features to obtain the second mood.
The embodiment of the present invention can obtain mood data and an image to be processed, identify the target emotion reflected by the mood data, match a corresponding target filter mode for the image to be processed according to the target emotion, and finally perform filter processing on the image to be processed using the target filter mode to obtain a target image. Performing filter processing on an image based on mood in this way can solve problems in the traditional technology such as poor image enhancement effects, inability to accurately express the user's true intention, and reduced enthusiasm for interaction. The above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that it is still possible to modify the technical solutions described in the foregoing embodiments, or to make equivalent replacements for some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the various embodiments of the present invention.
Claims (15)
1. An image processing method, characterized in that the method comprises:
obtaining mood data and an image to be processed, the mood data comprising mood voice data, mood image data or mood text data;
identifying a target emotion reflected by the mood data;
matching, according to the target emotion, a corresponding target filter mode for the image to be processed;
performing filter processing on the image to be processed using the target filter mode to obtain a target image.
2. The method according to claim 1, characterized in that the obtaining mood data and an image to be processed comprises:
in response to a dynamic release instruction in a social application, collecting mood data;
if the mood data includes mood image data, determining the mood image data as the image to be processed.
3. The method according to claim 1, characterized in that the obtaining mood data and an image to be processed comprises:
in response to a dynamic release instruction in a social application, collecting mood data;
obtaining the image to be processed according to the dynamic release instruction.
4. The method according to claim 2 or 3, characterized in that the method further comprises:
if the mood data includes mood voice data or mood text data, synthesizing the mood voice data or mood text data into the target image to obtain a composite image;
publishing the composite image in the social application.
5. The method according to claim 4, characterized in that the method further comprises:
in response to a first viewing operation for the composite image, displaying the target image in the composite image; or,
in response to a second viewing operation for the composite image, displaying the target image in the composite image and playing target speech data, the target speech data being voice data correspondingly converted from the mood voice data or the mood text data; or,
in response to a third viewing operation for the composite image, displaying the target image and target text data in the composite image, the target text data being text data correspondingly converted from the mood text data or the mood voice data.
6. The method according to any one of claims 1-5, characterized in that the matching, according to the target emotion, a corresponding target filter mode for the image to be processed comprises:
obtaining a mood filter mapping table, the mood filter mapping table recording mapping relations between moods and filter modes, each mapping relation associating one filter mode with at least one mood;
obtaining the target filter mode corresponding to the target emotion from the mood filter mapping table.
7. The method according to any one of claims 1-6, characterized in that the mood data comprises mood voice data, and the identifying the target emotion reflected by the mood data comprises:
converting the mood voice data into corresponding mood text data, and extracting text features from the corresponding mood text data;
extracting acoustic features from the mood voice data;
calling a first mood model to perform fusion recognition on the text features and the acoustic features to obtain the target emotion.
8. The method according to any one of claims 1-6, characterized in that the mood data comprises mood voice data, and the identifying the target emotion reflected by the mood data comprises:
converting the mood voice data into corresponding mood text data, and calling a second mood model to perform semantic analysis on the corresponding mood text data to obtain a first mood;
calling a third mood model to perform acoustic analysis on the mood voice data to obtain a second mood;
when the similarity between the first mood and the second mood is greater than or equal to a first threshold, determining the first mood or the second mood as the target emotion;
when the similarity between the first mood and the second mood is less than the first threshold, determining the first mood as the target emotion.
9. The method according to any one of claims 1-6, characterized in that the mood data comprises mood image data, and the identifying the target emotion reflected by the mood data comprises:
extracting a target facial expression from the mood image data, and obtaining a third mood reflected by the target facial expression;
extracting a target limb behavior from the mood image data, and obtaining a fourth mood reflected by the target limb behavior;
when the similarity between the third mood and the fourth mood is greater than or equal to a second threshold, determining the third mood or the fourth mood as the target emotion;
when the similarity between the third mood and the fourth mood is less than the second threshold, determining the third mood as the target emotion.
10. The method according to claim 8, characterized in that the calling the second mood model to perform semantic analysis on the mood text data to obtain the first mood comprises:
performing semantic analysis on the mood text data to obtain at least one candidate mood vocabulary item;
performing similarity matching between the candidate mood vocabulary and the reference mood vocabulary included in the first mood model, to obtain similarities between the candidate mood vocabulary and the reference mood vocabulary;
determining the mood reflected by a target mood vocabulary item as the first mood;
wherein the target mood vocabulary item is, among the at least one candidate mood vocabulary item, the item whose similarity is greater than or equal to a third threshold and whose corresponding reference mood vocabulary item has a weight greater than or equal to a fourth threshold, the weight of a reference mood vocabulary item being used to indicate the intensity of the mood reflected by that reference mood vocabulary item.
11. The method according to claim 8, characterized in that the calling the third mood model to perform acoustic analysis on the mood voice data to obtain the second mood comprises:
performing feature extraction on the mood voice data in the time domain to obtain time-domain acoustic features;
performing feature extraction on the mood voice data in the frequency domain to obtain frequency-domain acoustic features;
analyzing the time-domain acoustic features and the frequency-domain acoustic features to obtain the second mood.
12. An image processing method, characterized in that the method comprises:
in response to a dynamic release instruction in a social application, obtaining mood data and an image to be processed, the mood data comprising mood voice data, mood image data or mood text data;
identifying a target emotion reflected by the mood data, and matching for the image to be processed a target filter mode corresponding to the target emotion;
performing filter processing on the image to be processed using the target filter mode to obtain a target image;
publishing the target image in the social application.
13. An image processing apparatus, characterized by comprising:
an acquiring unit, configured to obtain mood data and an image to be processed, the mood data comprising mood voice data, mood image data or mood text data;
a recognition unit, configured to identify a target emotion reflected by the mood data;
a matching unit, configured to match, according to the target emotion, a corresponding target filter mode for the image to be processed;
a processing unit, configured to perform filter processing on the image to be processed using the target filter mode to obtain a target image.
14. A terminal, comprising an input device and an output device, characterized by further comprising:
a processor, adapted to implement one or more instructions; and
a computer storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute the image processing method according to any one of claims 1-11.
15. A computer storage medium, characterized in that the computer storage medium stores one or more instructions, the one or more instructions being adapted to be loaded by a processor to execute the image processing method according to any one of claims 1-11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910693744.9A CN110442867A (en) | 2019-07-30 | 2019-07-30 | Image processing method, device, terminal and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110442867A true CN110442867A (en) | 2019-11-12 |
Family
ID=68432176
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910693744.9A Pending CN110442867A (en) | 2019-07-30 | 2019-07-30 | Image processing method, device, terminal and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110442867A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107992824A (en) * | 2017-11-30 | 2018-05-04 | 努比亚技术有限公司 | Take pictures processing method, mobile terminal and computer-readable recording medium |
CN108537749A (en) * | 2018-03-29 | 2018-09-14 | 广东欧珀移动通信有限公司 | Image processing method, device, mobile terminal and computer readable storage medium |
CN108805089A (en) * | 2018-06-14 | 2018-11-13 | 南京云思创智信息科技有限公司 | Based on multi-modal Emotion identification method |
CN109254669A (en) * | 2017-07-12 | 2019-01-22 | 腾讯科技(深圳)有限公司 | A kind of expression picture input method, device, electronic equipment and system |
CN109325904A (en) * | 2018-08-28 | 2019-02-12 | 百度在线网络技术(北京)有限公司 | Image filters treating method and apparatus |
CN109660728A (en) * | 2018-12-29 | 2019-04-19 | 维沃移动通信有限公司 | A kind of photographic method and device |
CN109766759A (en) * | 2018-12-12 | 2019-05-17 | 成都云天励飞技术有限公司 | Emotion identification method and Related product |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110879840A (en) * | 2019-11-19 | 2020-03-13 | 珠海格力电器股份有限公司 | Information feedback method, device and storage medium |
CN110991427A (en) * | 2019-12-25 | 2020-04-10 | 北京百度网讯科技有限公司 | Emotion recognition method and device for video and computer equipment |
EP4174849A1 (en) * | 2021-11-02 | 2023-05-03 | Capital One Services, LLC | Automatic generation of a contextual meeting summary |
US11967314B2 (en) | 2021-11-02 | 2024-04-23 | Capital One Services, Llc | Automatic generation of a contextual meeting summary |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |