CN104462454A - Character analyzing method - Google Patents

Character analyzing method

Info

Publication number
CN104462454A
Authority
CN
China
Prior art keywords
video
audio
data
audio frequency
video information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410780141.XA
Other languages
Chinese (zh)
Inventor
孟桂国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Feixun Data Communication Technology Co Ltd
Original Assignee
Shanghai Feixun Data Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Feixun Data Communication Technology Co Ltd
Priority to CN201410780141.XA
Publication of CN104462454A
Legal status: Pending

Classifications

    • G06F19/34

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a character analysis method implemented on a personal mobile terminal. The method includes: a data acquisition module acquires audio/video information from the personal mobile terminal; a data filtering and parsing module performs incomplete-data filtering and redundant-data filtering on the audio/video information acquired by the data acquisition module and builds an audio/video information database from the filtered data; and a character analysis module compares, by text similarity, the audio/video information database against a preset personality-attribute and content corpus to determine which personality type the user of the personal mobile terminal belongs to. The method has the advantages that the collected data is targeted, data security is ensured, data accuracy is higher, and the analysis result is more accurate.

Description

Character analysis method
Technical field
The present invention relates to a character analysis method.
Background art
Childhood and adolescence are the most important periods for shaping a person's character. Because parents are busy with work and other matters, communication with their children is limited, and they gain little insight into how their children's character is developing. At the same time, the complexity of modern social phenomena and interpersonal communication makes children more prone to psychological problems. Current character analysis for children relies mainly on mental health counseling and education institutions, and most intervention is remedial, carried out only after a character defect has already appeared, rather than preventive. Some approaches analyze character from data such as network data, but they have drawbacks: network data comes from many sources and is complex to analyze, it is disputable whether the data reflects the user's own everyday online behavior, the accuracy of the data is low, some of the data involves personal privacy, and network data can easily be intercepted during transmission, posing security risks. Other approaches analyze character from mobile phone information (for example, judging the user's mood and character from call voice), but the data are call recordings made at a particular moment, so the inferred character is tied to the user's mood at that time and is not very accurate.
Summary of the invention
The invention provides a character analysis method in which the collected data is targeted, data security is ensured, data accuracy is higher, and the analysis result is more accurate.
To achieve the above object, the invention provides a character analysis method implemented on a personal mobile terminal and comprising the following steps:
Step S1: a data acquisition module acquires audio/video information from the personal mobile terminal;
Step S2: a data filtering and parsing module performs incomplete-data filtering and redundant-data filtering on the audio/video information acquired by the data acquisition module, and builds an audio/video information database from the filtered data;
Step S3: a character analysis module compares, by text similarity, the audio/video information database produced by the data filtering and parsing module against a preset personality-attribute and content corpus, and determines which personality type the user of the personal mobile terminal belongs to.
In step S1, the data acquisition module acquires audio/video information from an audio player, a video player, and web pages.
In step S1, the data acquisition module acquires audio information from the audio player; the audio information comprises the audio title, the audio performer, the album name, the playback time, and the play count.
In step S1, the data acquisition module acquires video information from the video player; the video information comprises the video title, the playback time, and the play count.
In step S1, the data acquisition module acquires audio/video information from web pages; the audio/video information comprises the URL, the browsing time, the audio/video title, the audio/video distinguishing flag, the audio performer or video performer, the visit count, a played flag, and a downloaded flag.
When the data acquisition module acquires audio/video information from a web page and the directly obtained audio/video title is empty, a web page content parsing technique is used to parse the text of the page and extract the audio/video title from the text.
In step S2, the incomplete-data filtering performed by the data filtering and parsing module on the audio/video titles acquired by the data acquisition module comprises: comparing audio/video titles by text similarity, and regarding titles whose text similarity is greater than or equal to a threshold as referring to the same audio/video item.
In step S2, the redundant-data filtering performed by the data filtering and parsing module on the audio/video titles acquired by the data acquisition module comprises: taking the audio/video title as the criterion and removing duplicate records.
In step S2, the audio/video information in the audio/video information database comprises the audio/video title, the audio performer or video performer, the audio/video distinguishing flag, the playback time, and the play count.
The personality-attribute and content corpus comprises the audio/video title, the audio performer or video performer, the audio/video distinguishing flag, the audio/video genre, the personality type corresponding to the audio/video genre, and the personality content.
In step S3, the character analysis module takes the audio/video title as the key field, compares by text similarity those records in the audio/video information database produced by the data filtering and parsing module whose play count exceeds a play threshold against the preset personality-attribute and content corpus, and, if the similarity of an audio/video title is greater than or equal to a comparison threshold, determines that the user of the personal mobile terminal belongs to the personality type corresponding to that audio/video title.
The invention has the following advantages:
1. The data source is collected mainly from the child's daily entertainment use of the mobile terminal, so the collected data is targeted.
2. The data comes from local data on the mobile terminal the child uses every day and does not need to be transmitted over the network, so data security is ensured.
3. The collected data undergoes data filtering and parsing, and the character analysis is performed by text similarity comparison, so data accuracy is higher.
4. The personality-attribute and content corpus is enriched through autonomous learning, so the analysis result is more accurate.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a detailed flow chart of an embodiment of the invention.
Detailed description of the embodiments
A preferred embodiment of the present invention is described below with reference to Fig. 1 and Fig. 2.
As shown in Fig. 1 and Fig. 2, the invention provides a character analysis method implemented on a personal mobile terminal and comprising the following steps.
Step S1: the data acquisition module acquires audio/video information from the personal mobile terminal.
The audio/video information comes from an audio player, a video player, and web pages.
Step S1 comprises the following steps:
Step S1.1: the data acquisition module acquires audio information from the audio player; the audio information comprises the audio title, the audio performer, the album name, the playback time, and the play count, and is recorded in Table 1.
Table 1
Step S1.2: the data acquisition module acquires video information from the video player; the video information comprises the video title, the playback time, and the play count, and is recorded in Table 2.
Table 2
Step S1.3: the data acquisition module acquires audio/video information from web pages; the audio/video information comprises the URL, the browsing time, the audio/video title, the audio/video distinguishing flag, the audio performer or video performer, the visit count, a played flag, and a downloaded flag, and is recorded in Table 3.
Table 3
In this embodiment, the audio/video distinguishing flag is "0" for audio and "1" for video; the played flag is "0" for not played and "1" for played; and the downloaded flag is "0" for not downloaded and "1" for downloaded.
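The following is a minimal sketch, in Python, of how the records in Tables 1 to 3 could be represented. The field names and types are illustrative assumptions; the patent only lists the fields informally and does not prescribe a concrete data structure.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class AudioRecord:          # Table 1: data acquired from the audio player
        title: str              # audio title
        performer: str          # audio performer
        album: str              # album name
        play_time: str          # playback time
        play_count: int         # play count

    @dataclass
    class VideoRecord:          # Table 2: data acquired from the video player
        title: str
        play_time: str
        play_count: int

    @dataclass
    class WebAVRecord:          # Table 3: data acquired from web pages
        url: str
        browse_time: str
        title: Optional[str]    # may be empty; filled in by page parsing (step S1.5)
        av_flag: int            # 0 = audio, 1 = video
        performer: str
        visit_count: int
        played: int             # 0 = not played, 1 = played
        downloaded: int         # 0 = not downloaded, 1 = downloaded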
When the data acquisition module acquires audio/video information from a web page, the page being browsed or played does not necessarily provide a complete audio/video title. In that case a web page content parsing technique is needed to parse the text on the page and extract the audio/video title from it.
Therefore, in step S1.3, when acquiring the audio/video title, the data acquisition module performs step S1.4: it checks whether the audio/video title can be obtained directly from the web page, i.e. whether the audio/video title is empty. If it is empty, step S1.5 is performed; otherwise the audio/video title is obtained directly.
Step S1.5: the data acquisition module parses the text on the web page and extracts the audio/video title from it.
The data acquisition module uses the web page content parsing technique to filter the tags on the page and locate key tags from which the text is extracted; key tags here are those relating to, for example, the song title, singer, song genre, film, or film genre.
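As an illustration of this parsing step, the sketch below uses Python's standard html.parser to collect text from the page's <title> element and from elements whose class or id attributes contain hint keywords such as "song" or "singer". The keyword list, the attribute heuristic, and the simplified tag tracking are assumptions; the patent does not specify a concrete parsing technique.

    from html.parser import HTMLParser

    KEY_HINTS = ("song", "singer", "album", "movie", "film", "title")  # illustrative hints

    class TitleExtractor(HTMLParser):
        """Collects text found inside <title> or inside elements whose class/id
        attributes contain one of the KEY_HINTS keywords (simplified handling of
        unbalanced HTML)."""
        def __init__(self):
            super().__init__()
            self._stack = []        # True for elements whose text we want to keep
            self.candidates = []

        def handle_starttag(self, tag, attrs):
            attr_text = " ".join(v or "" for _, v in attrs).lower()
            self._stack.append(tag == "title" or any(h in attr_text for h in KEY_HINTS))

        def handle_endtag(self, tag):
            if self._stack:
                self._stack.pop()

        def handle_data(self, data):
            if any(self._stack) and data.strip():
                self.candidates.append(data.strip())

    def extract_av_title(html: str) -> str:
        """Return the first candidate audio/video title found on the page, or ''."""
        parser = TitleExtractor()
        parser.feed(html)
        return parser.candidates[0] if parser.candidates else ""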
Step S2: the data filtering and parsing module performs incomplete-data filtering and redundant-data filtering on the audio/video titles acquired by the data acquisition module, and builds the audio/video information database from the filtered data.
Step S2 comprises the following steps:
Step S2.1: the data filtering and parsing module performs incomplete-data filtering and redundant-data filtering on the audio/video titles acquired by the data acquisition module.
The incomplete-data filtering comprises: comparing audio/video titles by text similarity; if the text similarity is greater than or equal to a threshold, the titles are regarded as referring to the same audio/video item.
In this embodiment the threshold is 80%; a word segmentation technique splits each audio/video title into several keywords, and the text similarity comparison is performed over these keywords.
The redundant-data filtering comprises: taking the audio/video title as the criterion and removing duplicate records.
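One possible reading of this filtering step is sketched below: titles are segmented into keywords and compared by keyword overlap (Jaccard similarity) against the 80% threshold. The whitespace/punctuation segmentation and the Jaccard measure are assumptions standing in for the word segmentation and text similarity techniques that the patent leaves unspecified.

    import re

    SIM_THRESHOLD = 0.8   # 80% threshold used in this embodiment

    def segment(title: str) -> set:
        """Split a title into keywords; a real implementation would use a proper
        word segmenter for Chinese titles instead of this simple split."""
        return {w for w in re.split(r"[\s\W_]+", title.lower()) if w}

    def title_similarity(a: str, b: str) -> float:
        """Keyword-overlap (Jaccard) similarity between two titles."""
        ka, kb = segment(a), segment(b)
        if not ka or not kb:
            return 0.0
        return len(ka & kb) / len(ka | kb)

    def same_item(title_a: str, title_b: str) -> bool:
        """Titles at or above the threshold are treated as the same audio/video item."""
        return title_similarity(title_a, title_b) >= SIM_THRESHOLD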
Step S2.2: the data filtering and parsing module builds the audio/video information database.
The audio/video information in the audio/video information database comprises the audio/video title, the audio performer or video performer, the audio/video distinguishing flag, the playback time, and the play count.
The audio/video information database is shown in Table 4.
Table 4
In this embodiment, the audio/video distinguishing flag is "0" for audio and "1" for video.
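Continuing the sketch above (and reusing same_item), the database of Table 4 could be built as below. Treating each record as a dict and accumulating play counts when titles match is an assumption; the patent only states that duplicate records are removed using the title as the criterion.

    def build_av_database(records):
        """Merge filtered records into the audio/video information database (Table 4).

        Each record is a dict with keys: title, performer, av_flag, play_time,
        play_count."""
        database = []
        for rec in records:
            for row in database:
                if row["av_flag"] == rec["av_flag"] and same_item(row["title"], rec["title"]):
                    row["play_count"] += rec["play_count"]   # one row per item
                    row["play_time"] = rec["play_time"]      # keep the most recent playback time
                    break
            else:
                database.append(dict(rec))
        return database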
Step S3: the character analysis module compares, by text similarity, the audio/video information database produced by the data filtering and parsing module against the preset personality-attribute and content corpus, and determines which personality type the user of the personal mobile terminal belongs to.
As shown in Table 5, the personality-attribute and content corpus comprises the audio/video title, the audio performer or video performer, the audio/video distinguishing flag, the audio/video genre, the personality type corresponding to the audio/video genre, and the personality content.
Table 5
In this embodiment, the audio/video distinguishing flag is "0" for audio and "1" for video. The audio/video genre classifies audio into categories such as pop, nostalgia, sentimental, and rock, and video into categories such as costume drama, violence, romance, and science fiction. The personality types are the four temperaments "sanguine", "choleric", "phlegmatic", and "melancholic". The personality content is a detailed explanation of the personality type.
The character analysis module takes the audio/video title as the key field, compares by text similarity those records in the audio/video information database produced by the data filtering and parsing module whose play count exceeds the play threshold against the preset personality-attribute and content corpus, and, if the similarity of an audio/video title is greater than or equal to the comparison threshold, determines that the user of the personal mobile terminal belongs to the personality type corresponding to that audio/video title.
In this embodiment the play threshold is 1 and the comparison threshold is 80%; the word segmentation technique splits the audio/video title and the audio performer or video performer into several keywords, and the text similarity comparison is performed over these keywords.
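A minimal sketch of this matching step, continuing the code above, is shown below. The corpus rows are assumed to be dicts with title, performer, av_flag, genre, personality_type, and personality_content keys, and every matched personality type is returned; the patent does not state how several matches are combined into a single judgement.

    PLAY_THRESHOLD = 1        # embodiment value: only items played more than once are compared
    COMPARE_THRESHOLD = 0.8   # embodiment value: 80%

    def analyse_character(database, corpus):
        """Compare frequently played items against the personality-attribute and
        content corpus and return the matched personality types."""
        matched_types = []
        for row in database:
            if row["play_count"] <= PLAY_THRESHOLD:
                continue
            key = f"{row['title']} {row.get('performer', '')}"
            for entry in corpus:
                ref = f"{entry['title']} {entry.get('performer', '')}"
                if title_similarity(key, ref) >= COMPARE_THRESHOLD:
                    matched_types.append(entry["personality_type"])
        return matched_types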
The character analysis method provided by the invention further comprises step S4: autonomous learning and refinement of the personality-attribute and content corpus.
Based on the results of the character analysis, correspondences between audio/video information data and personality types are added to the personality-attribute and content corpus, so that the corpus data becomes increasingly comprehensive and the character obtained by the analysis becomes more accurate.
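Step S4 could be sketched as follows, again reusing the structures above. Tagging each frequently played but previously unknown item with the personality type produced by step S3 is an assumed policy; the patent only says that new correspondences between audio/video data and personality types are added to the corpus.

    def update_corpus(corpus, database, personality_type):
        """Add newly observed item-to-personality correspondences to the corpus so
        that later analyses have richer data (autonomous learning, step S4)."""
        known_titles = {entry["title"] for entry in corpus}
        for row in database:
            if row["play_count"] > PLAY_THRESHOLD and row["title"] not in known_titles:
                corpus.append({
                    "title": row["title"],
                    "performer": row.get("performer", ""),
                    "av_flag": row["av_flag"],
                    "genre": "",                           # unknown until labelled
                    "personality_type": personality_type,  # result from step S3
                    "personality_content": "",
                })
                known_titles.add(row["title"])
        return corpus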
The personality information obtained by the analysis is presented to the user, making it easier to grasp the character of the person using the personal mobile terminal, in particular the character of a child using the terminal, and thus to guide the child toward healthy and positive development. The invention can not only detect in time whether a child's character shows defects, but can also analyze the child's future character tendency; the traits identified reflect both negative and positive aspects. More importantly, the child's character is analyzed accurately from these data, the child's character tendency and any defects are grasped in time, and prevention can be carried out in advance.
Although the content of the present invention has been described in detail through the above preferred embodiment, it should be understood that the above description should not be regarded as limiting the invention. Various modifications and substitutions of the invention will be apparent to those skilled in the art after reading the above. Therefore, the scope of protection of the invention should be defined by the appended claims.

Claims (11)

1. A character analysis method, characterized in that the method is implemented on a personal mobile terminal and comprises the following steps:
step S1: a data acquisition module acquires audio/video information from the personal mobile terminal;
step S2: a data filtering and parsing module performs incomplete-data filtering and redundant-data filtering on the audio/video information acquired by the data acquisition module, and builds an audio/video information database from the filtered data;
step S3: a character analysis module compares, by text similarity, the audio/video information database produced by the data filtering and parsing module against a preset personality-attribute and content corpus, and determines which personality type the user of the personal mobile terminal belongs to.
2. The character analysis method of claim 1, characterized in that in step S1 the data acquisition module acquires audio/video information from an audio player, a video player, and web pages.
3. The character analysis method of claim 2, characterized in that in step S1 the data acquisition module acquires audio information from the audio player, the audio information comprising the audio title, the audio performer, the album name, the playback time, and the play count.
4. The character analysis method of claim 2, characterized in that in step S1 the data acquisition module acquires video information from the video player, the video information comprising the video title, the playback time, and the play count.
5. The character analysis method of claim 2, characterized in that in step S1 the data acquisition module acquires audio/video information from web pages, the audio/video information comprising the URL, the browsing time, the audio/video title, the audio/video distinguishing flag, the audio performer or video performer, the visit count, a played flag, and a downloaded flag.
6. The character analysis method of claim 5, characterized in that when the data acquisition module acquires audio/video information from a web page and the directly obtained audio/video title is empty, a web page content parsing technique is used to parse the text of the page and extract the audio/video title from the text.
7. The character analysis method of any one of claims 1 to 6, characterized in that in step S2 the incomplete-data filtering performed by the data filtering and parsing module on the audio/video titles acquired by the data acquisition module comprises: comparing audio/video titles by text similarity, and regarding titles whose text similarity is greater than or equal to a threshold as referring to the same audio/video item.
8. The character analysis method of claim 7, characterized in that in step S2 the redundant-data filtering performed by the data filtering and parsing module on the audio/video titles acquired by the data acquisition module comprises: taking the audio/video title as the criterion and removing duplicate records.
9. The character analysis method of claim 8, characterized in that in step S2 the audio/video information in the audio/video information database comprises the audio/video title, the audio performer or video performer, the audio/video distinguishing flag, the playback time, and the play count.
10. The character analysis method of claim 9, characterized in that the personality-attribute and content corpus comprises the audio/video title, the audio performer or video performer, the audio/video distinguishing flag, the audio/video genre, the personality type corresponding to the audio/video genre, and the personality content.
11. The character analysis method of claim 10, characterized in that in step S3 the character analysis module takes the audio/video title as the key field, compares by text similarity those records in the audio/video information database produced by the data filtering and parsing module whose play count exceeds a play threshold against the preset personality-attribute and content corpus, and, if the similarity of an audio/video title is greater than or equal to a comparison threshold, determines that the user of the personal mobile terminal belongs to the personality type corresponding to that audio/video title.
CN201410780141.XA 2014-12-17 2014-12-17 Character analyzing method Pending CN104462454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410780141.XA CN104462454A (en) 2014-12-17 2014-12-17 Character analyzing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410780141.XA CN104462454A (en) 2014-12-17 2014-12-17 Character analyzing method

Publications (1)

Publication Number Publication Date
CN104462454A true CN104462454A (en) 2015-03-25

Family

ID=52908489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410780141.XA Pending CN104462454A (en) 2014-12-17 2014-12-17 Character analyzing method

Country Status (1)

Country Link
CN (1) CN104462454A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254265A (en) * 2010-05-18 2011-11-23 北京首家通信技术有限公司 Rich media internet advertisement content matching and effect evaluation method
CN102541590A (en) * 2011-12-14 2012-07-04 奇智软件(北京)有限公司 Software recommending method and system
CN103294795A (en) * 2013-05-24 2013-09-11 华东师范大学 Method for adjusting film recommending diversity by utilizing users' characters
CN104091610A (en) * 2013-11-15 2014-10-08 腾讯科技(深圳)有限公司 Audio file managing method and device
CN104202620A (en) * 2014-07-04 2014-12-10 南京超聚通信科技有限公司 System and method of implementing video on demand and review of unidirectional set-top box through mobile intelligent terminal

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021864A (en) * 2017-11-02 2018-05-11 平安科技(深圳)有限公司 Character personality analysis method, device and storage medium
CN108038414A (en) * 2017-11-02 2018-05-15 平安科技(深圳)有限公司 Character personality analysis method, device and storage medium based on Recognition with Recurrent Neural Network
WO2019085330A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Personal character analysis method, device, and storage medium
WO2019085329A1 (en) * 2017-11-02 2019-05-09 平安科技(深圳)有限公司 Recurrent neural network-based personal character analysis method, device, and storage medium
CN109146430A (en) * 2018-09-05 2019-01-04 福建省伯乐仁资智能科技有限公司 A kind of Online Video interview method and system
CN111297379A (en) * 2020-02-10 2020-06-19 中国科学院深圳先进技术研究院 Brain-computer combination system and method based on sensory transmission
WO2021159230A1 (en) * 2020-02-10 2021-08-19 中国科学院深圳先进技术研究院 Brain-computer interface system and method based on sensory transmission

Similar Documents

Publication Publication Date Title
CN106953887B (en) Fine-grained radio station audio content personalized organization recommendation method
Pozzi et al. Challenges of sentiment analysis in social networks: an overview
Torgersen et al. A corpus-based study of pragmatic markers in London English
CN104462454A (en) Character analyzing method
KR101605430B1 (en) SYSTEM AND METHOD FOR BUINDING QAs DATABASE AND SEARCH SYSTEM AND METHOD USING THE SAME
CN104915443B (en) A kind of abstracting method of Chinese microblogging evaluation object
CN106250553A (en) A kind of service recommendation method and terminal
CN107885745A (en) A kind of song recommendations method and device
EP3087505A1 (en) System and methods for vocal commenting on selected web pages
CN104731874B (en) A kind of evaluation information generation method and device
Nozza et al. A multi-view sentiment corpus
Hong The power of Bollywood: A study on opportunities, challenges, and audiences’ perceptions of Indian cinema in China
Snell Social class and language
Raamkumar et al. Understanding the Twitter usage of humanities and social sciences academic journals
KR101326313B1 (en) Method of classifying emotion from multi sentence using context information
CN106934049B (en) News question selection analysis method and device
Lacasse et al. # Yoga on instagram: Understanding the nature of yoga in the online conversation and community
CN108170845A (en) Multimedia data processing method, device and storage medium
KR20170034481A (en) Intelligent mobile recommended music service system and method
CN108416015A (en) A kind of information security method for pushing
CN110019921B (en) Audio and attribute association method and device and audio searching method and device
Panwar et al. Impact of fake news on readers’ usage behaviour for news items on facebook and twitter
Wu et al. When diversity meets speciality: Friend recommendation in online social networks
Ahmad et al. Personality prediction of Malaysian Facebook users: cultural preferences and features variation
Carter Going Gaga: Pop fandom as online community of practice

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150325
