CN109344253A

CN109344253A - Method, apparatus, computer device and storage medium for adding user tags

Info

Publication number: CN109344253A
Application number: CN201811089471.9A
Authority: CN
Inventors: 魏慕茹
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2018-09-18
Filing date: 2018-09-18
Publication date: 2019-02-15

Abstract

This application relates to the field of big data, and in particular to a method, an apparatus, a computer device and a storage medium for adding user tags. The method includes: obtaining text information containing a user's personal information; inputting the text information into a preset natural language processing model based on NLP technology; processing the text information with the natural language processing model to obtain the word vector corresponding to the text information; classifying the word vector corresponding to the text information with a preset clustering algorithm to obtain a classification result of the user attribute corresponding to the user's personal information; and adding a corresponding tag to the user according to the classification result. The user's text information is obtained, the user is classified based on an understanding of that text information, and a corresponding tag is then added to the user, so that tags are added automatically. This is intended to solve the problem that existing user tag systems rely on manual tagging, which wastes a large amount of manpower and may cause a large number of user tags to be added incorrectly.

Description

Method, apparatus, computer device and storage medium for adding user tags

Technical field

This application relates to the field of big data, and in particular to a method, an apparatus, a computer device and a storage medium for adding user tags.

Background art

Existing user tag systems cannot automatically identify and determine a user's category; tags can only be set manually. Manual tagging not only wastes a large amount of manpower but, because the criteria applied by different people vary, may also cause a large number of user tags to be added incorrectly.

Summary of the application

In view of the shortcomings of the prior art, the application proposes a method, an apparatus, a computer device and a storage medium for adding user tags. The user's text information is obtained, the user is classified based on an understanding of that text information, and a corresponding tag is then added to the user. This is intended to solve the problem that existing user tag systems rely on manual tagging, which wastes a large amount of manpower and may cause a large number of user tags to be added incorrectly.

The technical solution proposed by the application is as follows:

A method for adding user tags, the method comprising:

Obtaining text information containing a user's personal information;

Inputting the text information into a preset natural language processing model based on NLP technology;

Processing the text information with the natural language processing model to obtain the word vector corresponding to the text information;

Classifying the word vector corresponding to the text information with a preset clustering algorithm to obtain a classification result of the user attribute corresponding to the user's personal information;

Adding a corresponding tag to the user according to the classification result.

Further, the step of obtaining text information containing the user's personal information includes:

Obtaining user data of different user data types that contains the user's personal information;

Processing the user data in a preset processing mode, according to the user data type of the user data, to obtain the text information.

Further, the step of processing the user data in a preset processing mode, according to the user data type of the user data, to obtain the text information includes:

When the user data type is the voice information type, converting the user data into text information with a preset ASR model.

When the user data type is the picture information type, extracting the text in the user data with a preset OCR-based text recognition model;

Converting the extracted text into the text information.

Further, the step of classifying the word vector corresponding to the text information with the preset clustering algorithm to obtain the classification result of the user attribute corresponding to the user's personal information includes:

Using cosine similarity to find, among the word vectors corresponding to a plurality of preset user attributes, the target word vector whose included-angle cosine with the word vector corresponding to the text information is close to a preset threshold;

Determining the user attribute corresponding to the target word vector as the classification result of the user attribute corresponding to the user's personal information.

Identifying the suffix information of the user data;

Judging the user data type according to the suffix information;

Calling, according to the user data type, the preset processing mode corresponding to that user data type to process the user data and obtain the text information.

Further, after the step of adding a corresponding tag to the user according to the classification result, the method includes:

Pushing the tags that have been added to the user to the customer service agent serving the user.

The present invention also provides an apparatus for adding user tags, the apparatus comprising:

An acquisition module, configured to obtain text information containing a user's personal information;

An input module, configured to input the text information into a preset natural language processing model based on NLP technology;

A word vector obtaining module, configured to process the text information with the natural language processing model to obtain the word vector corresponding to the text information;

A classification module, configured to classify the word vector corresponding to the text information with a preset clustering algorithm to obtain a classification result of the user attribute corresponding to the user's personal information;

A tag adding module, configured to add a corresponding tag to the user according to the classification result.

The application also provides a computer device, including a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of any of the methods described above when executing the computer program.

The application also provides a computer-readable storage medium storing a computer program, characterized in that the computer program implements the steps of any of the methods described above when executed by a processor.

With the above technical solution, the application has the following beneficial effects: the user's text information is obtained, the user is classified based on an understanding of that text information, and a corresponding tag is then added to the user, so that tags are added automatically. This is intended to solve the problem that existing user tag systems rely on manual tagging, which wastes a large amount of manpower and may cause a large number of user tags to be added incorrectly.

Brief description of the drawings

Fig. 1 is a flowchart of the method for adding user tags provided by an embodiment of the application;

Fig. 2 is a functional block diagram of the apparatus for adding user tags provided by an embodiment of the application;

Fig. 3 is a schematic structural block diagram of the computer device provided by an embodiment of the application.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.

As shown in Fig. 1, an embodiment of the application proposes a method for adding user tags, the method including the following steps:

Step S101: obtain text information containing a user's personal information.

Obtaining the user's text information mainly means obtaining text information that contains information relating to the individual user.

During acquisition, not all of the user data obtained is text information. If the user data obtained is non-text information, it is processed so that the non-text information is converted into text information.

Step S101 includes:

Obtaining user data of different user data types that contains personal information;

Processing the user data in a preset processing mode, according to the user data type of the user data, to obtain text information.

User data containing personal information is obtained. User data comes in several user data types, mainly three: picture information, text information and voice information. The user data is processed in the preset processing mode that matches its type, and text information is obtained once processing is complete. Here, a preset processing mode is a processing mode by which text information can be obtained from the user data.

The step of processing the user data in a preset processing mode, according to the user data type of the user data, to obtain text information includes:

Identifying the suffix information of the user data;

Judging the user data type according to the suffix information;

Calling, according to the user data type, the preset processing mode corresponding to that user data type to process the user data and obtain text information.

After the user data is obtained, its suffix information is identified. Because voice information, text information and picture information have different suffix information, the suffix information indicates whether the user data is voice information, text information or picture information. The preset processing mode corresponding to that user data type is then called to obtain the text information.
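The suffix-based dispatch described above can be sketched as follows. This is a minimal illustration only: the suffix tables, the function names `detect_data_type` and `to_text`, and the placeholder handlers are assumptions, since the application does not list concrete suffixes or interfaces.

```python
from pathlib import Path

# Hypothetical suffix tables; the application does not list concrete suffixes.
VOICE_SUFFIXES = {".wav", ".mp3", ".amr"}
PICTURE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}
TEXT_SUFFIXES = {".txt", ".csv"}


def detect_data_type(filename):
    """Return 'voice', 'picture', 'text' or 'unknown' based on the file suffix."""
    suffix = Path(filename).suffix.lower()
    if suffix in VOICE_SUFFIXES:
        return "voice"
    if suffix in PICTURE_SUFFIXES:
        return "picture"
    if suffix in TEXT_SUFFIXES:
        return "text"
    return "unknown"


def to_text(filename, handlers):
    """Call the preset processing mode registered for the detected data type."""
    data_type = detect_data_type(filename)
    if data_type not in handlers:
        raise ValueError(f"no preset processing mode for {filename!r}")
    return handlers[data_type](filename)


# Placeholder handlers standing in for the ASR model, the OCR model and a plain read.
handlers = {
    "voice": lambda name: f"<transcript of {name}>",
    "picture": lambda name: f"<text extracted from {name}>",
    "text": lambda name: f"<contents of {name}>",
}

print(to_text("call_0917.WAV", handlers))
```

Keeping the handlers in a dictionary means a new data type only needs a new suffix table entry and a new handler, without changing the dispatch function.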

In this embodiment, the step of processing the user data in a preset processing mode, according to the user data type of the user data, to obtain text information includes:

When the user data type is the voice information type, converting the user data into text information with a preset ASR model.

When the user data type is the voice information type, the user data, that is, the voice information, needs to be converted into text information; the text information is obtained in this way.

When the user data type is the picture information type, extracting the text in the user data with a preset OCR-based text recognition model;

Converting the extracted text into text information.

When the user data type is the picture information type, the user data, that is, the picture information, needs to be converted. First, the text in the picture information is extracted; once extraction is complete, the extracted text is converted into text information, and the text information is obtained in this way. If no text can be extracted from the picture information within a preset time, the user data is judged to be invalid data. After the user data is judged to be invalid data, it is stored.
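The preset-time check on text extraction can be sketched as below. The function `extract_text_with_timeout`, the list `invalid_data_store` and the stand-in OCR callables are hypothetical; the application does not say how the time limit is enforced or where invalid data is stored, so a worker thread with a timeout is used here purely as an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

invalid_data_store = []  # hypothetical store for user data judged invalid


def extract_text_with_timeout(ocr_fn, picture, timeout_s):
    """Run ocr_fn(picture); return its text, or None if the preset time is exceeded."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(ocr_fn, picture)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # No text within the preset time: judge the data invalid and store it.
        invalid_data_store.append(picture)
        return None
    finally:
        # Do not block on the worker; a timed-out call still finishes in the background.
        pool.shutdown(wait=False)


def slow_ocr(picture):
    time.sleep(0.5)  # stands in for an extraction that takes too long
    return "late text"


print(extract_text_with_timeout(lambda p: "sample text", "form.png", timeout_s=1.0))
print(extract_text_with_timeout(slow_ocr, "blurry.png", timeout_s=0.05))
print(invalid_data_store)
```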

When the user data type is the text information type, the text information is output directly.

When the user data type is the text information type, no processing is needed; the text information is output directly, and the text information is obtained in this way.

Specifically, when a tag is to be added to a user according to user data, user data containing information relating to the individual user must first be obtained. The user data may be of different types, such as text information, voice information or picture information. Text information generally needs no additional data processing and can be used directly as the text information. Voice information needs to be converted into text information in the preset processing mode. For picture information, the text in the picture needs to be extracted in the preset processing mode and used as the text information.

The preset processing mode for converting user data, that is, voice information, into text information is specifically as follows: the voice information is converted into text information by a preset ASR model, and this preset ASR model needs to be trained. The ASR model is trained as follows. First, a large amount of sample data is obtained and divided into a training set and a test set, where the sample data includes voice information and the text information corresponding to the spoken words in that voice information. The sample data of the training set is input into the preset ASR model for training, which yields a trained model for converting voice information into text information. The voice information in the sample data of the test set is then input into the trained model to obtain text information results. These results are compared with the text information corresponding to the spoken words of the input voice information to verify whether the requirement is met. If the verification passes, training of the ASR model is complete. When voice information is then input into the ASR model, it is converted into the text information of the corresponding spoken words.
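The split-then-verify procedure described above can be sketched generically. The helper names `split_samples` and `passes_verification`, the 20% test ratio, the exact-match comparison and the 0.9 accuracy requirement are all assumptions; the application only states that the results are compared with the reference text to verify whether the requirement is met.

```python
import random


def split_samples(samples, test_ratio=0.2, seed=0):
    """Shuffle the samples and divide them into a training set and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def passes_verification(model, test_set, required_accuracy=0.9):
    """Compare model outputs with the reference text of each test sample."""
    correct = sum(1 for data, expected in test_set if model(data) == expected)
    return correct / len(test_set) >= required_accuracy


# Toy samples: (input, reference text). A real sample would pair audio with its transcript.
samples = [(f"clip_{i}", f"CLIP_{i}") for i in range(10)]
train_set, test_set = split_samples(samples)

# Stand-in for a trained model; upper-casing happens to match the toy references.
trained_model = str.upper
print(passes_verification(trained_model, test_set))
```

The same two helpers apply unchanged to any model that is trained on a training set and then checked against a held-out test set.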

ASR stands for Automatic Speech Recognition.

The preset processing mode for extracting the text in user data, that is, the text in picture information, as text information is specifically as follows: the text portion of the picture information is extracted by a preset OCR-based text recognition model to obtain the corresponding text information, and this model needs to be trained. The OCR-based text recognition model is trained as follows. First, a large amount of sample data is obtained and divided into a training set and a test set, where the sample data includes picture information and the text information corresponding to the text portion of that picture information. The sample data of the training set is input into the preset OCR-based text recognition model for training, which yields a trained model for extracting the text portion of picture information as text information. The picture information in the sample data of the test set is then input into the trained model to obtain text information results. These results are compared with the text information corresponding to the text portion of the input picture information to verify whether the requirement is met. If the verification passes, training of the OCR-based text recognition model is complete. When picture information is then input into the OCR-based text recognition model, the text portion of the picture information is extracted to obtain the corresponding text information.

OCR stands for Optical Character Recognition.

Further, before the picture information is input into the OCR-based text recognition model, it may be preprocessed. The purpose is to reduce useless information in the image so that the OCR-based text recognition model can perform feature extraction and training more easily. Specifically, the picture is first converted to grayscale. One way to do this is to compute the average of the R, G and B components of each pixel and then assign that average to all three components of the pixel.
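The averaging approach to grayscale conversion can be sketched in plain Python. The image representation (rows of `(R, G, B)` tuples) and the function name `to_grayscale` are assumptions made for illustration, as is the use of integer division for the average.

```python
def to_grayscale(image):
    """Replace each pixel's R, G and B components with their average.

    `image` is a list of rows, each row a list of (R, G, B) integer tuples.
    """
    gray = []
    for row in image:
        gray_row = []
        for r, g, b in row:
            avg = (r + g + b) // 3  # integer average of the three components
            gray_row.append((avg, avg, avg))
        gray.append(gray_row)
    return gray


picture = [
    [(30, 60, 90), (255, 0, 0)],
    [(0, 0, 0), (255, 255, 255)],
]
print(to_grayscale(picture))
```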

The resulting grayscale image is denoised with the BM3D (Block-Matching and 3D filtering) noise reduction algorithm.

The denoised grayscale image is then binarized to obtain a black-and-white image; specifically, the black-and-white image can be obtained by choosing the threshold adaptively. The binarized black-and-white image is then input into the OCR-based text recognition model, which detects the text portion in it to obtain the text information.
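One common form of adaptive thresholding compares each pixel with the mean of its local neighbourhood. The sketch below uses that local-mean rule as an assumption; the application does not say which adaptive method is used, and the function name `binarize_adaptive` and its `window` and `offset` parameters are hypothetical. BM3D denoising is not shown.

```python
def binarize_adaptive(gray, window=3, offset=0):
    """Binarize a grayscale image with a local-mean adaptive threshold.

    `gray` is a list of rows of integers in 0..255. A pixel becomes 255 (white)
    if it is brighter than the mean of its window minus `offset`, else 0 (black).
    """
    height, width = len(gray), len(gray[0])
    half = window // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - half), min(height, y + half + 1))
            xs = range(max(0, x - half), min(width, x + half + 1))
            neighbours = [gray[j][i] for j in ys for i in xs]
            local_mean = sum(neighbours) / len(neighbours)
            row.append(255 if gray[y][x] > local_mean - offset else 0)
        out.append(row)
    return out


print(binarize_adaptive([[10, 200], [200, 10]]))
```

Because the threshold follows the local brightness, unevenly lit pictures are less likely to lose text than with a single global threshold.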

Step S102: input the text information into the preset natural language processing model based on NLP technology.

Step S103: process the text information with the natural language processing model to obtain the word vector corresponding to the text information.

Step S104: classify the word vector corresponding to the text information with the preset clustering algorithm to obtain the classification result of the user attribute corresponding to the user's personal information.

The text information obtained above is input into the preset natural language processing model based on NLP technology, which calculates the word vector. The model is trained as follows. First, a large amount of sample data is obtained and divided into a training set and a test set, where the sample data includes text information and the word vector corresponding to the text portion of that text information. The sample data of the training set is input into the preset natural language processing model for training, which yields a trained model for calculating a word vector from text information. The text information in the sample data of the test set is then input into the trained model to calculate word vector results. These results are compared with the word vectors corresponding to the text portion of the input text information to verify whether the requirement is met. If the verification passes, training of the natural language processing model based on NLP technology is complete. When text information is then input into the preset natural language processing model based on NLP technology, the word vector is calculated.

NLP stands for Natural Language Processing, a subfield of artificial intelligence (AI).

After the word vector is obtained, it is classified by the preset clustering algorithm. For different word vectors, the classification result of the user attribute can be obtained with the preset clustering algorithm. Specifically, the similarity between the word vector corresponding to the text information and the word vector corresponding to each preset user attribute is calculated, and the preset user attribute whose word vector is most similar to the word vector corresponding to the text information gives the classification result of the user attribute.

For example, the text information "I bought vehicle damage insurance in 2017" is input into the preset natural language processing model based on NLP technology to calculate its word vector, and that word vector is then classified by the preset clustering algorithm. If the word vector corresponding to "I bought vehicle damage insurance in 2017" is most similar to that of the user attribute "has a car", the classification result "has a car" is obtained, which gives the classification result of the user attribute corresponding to the user's personal information.

Specifically, step S104 includes:

Using cosine similarity to find, among the word vectors corresponding to a plurality of preset user attributes, the target word vector whose included-angle cosine with the word vector corresponding to the text information is close to a preset threshold;

Determining the user attribute corresponding to the target word vector as the classification result of the user attribute corresponding to the user's personal information.

Using cosine similarity, the cosine of the included angle between the word vector corresponding to the text information and each of the word vectors corresponding to the preset user attributes is calculated. From the word vectors corresponding to the preset user attributes, the target word vector whose included-angle cosine with the word vector corresponding to the text information is close to the preset threshold is selected, and the user attribute corresponding to the target word vector is determined as the classification result of the user attribute corresponding to the user's personal information.

Specifically, the word vector needs to be classified by the preset clustering algorithm to obtain the classification result of the user attribute. The similarity between the word vector and the word vector corresponding to each classification result of the user attribute is calculated by cosine similarity: the closer the included-angle cosine of two word vectors is to 1, the more similar the two word vectors are. The classification results of the user attribute are classification results corresponding to user information, for example "has a car", "has a house", "has children" and "well educated".
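The cosine-similarity classification can be sketched as follows. The three-dimensional vectors, the attribute table `ATTRIBUTE_VECTORS`, the function names and the 0.8 threshold are invented for illustration. The sketch also assumes that "close to the preset threshold" means a cosine at or above the threshold, and it picks the highest-scoring attribute when several qualify; the application does not spell out either point.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the included angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def classify(text_vector, attribute_vectors, threshold=0.8):
    """Return the attribute with the highest cosine at or above the threshold, else None."""
    best_attribute, best_score = None, threshold
    for attribute, vector in attribute_vectors.items():
        score = cosine_similarity(text_vector, vector)
        if score >= best_score:
            best_attribute, best_score = attribute, score
    return best_attribute


# Hypothetical word vectors for preset user attributes.
ATTRIBUTE_VECTORS = {
    "has a car": [0.9, 0.1, 0.0],
    "has a house": [0.1, 0.9, 0.0],
    "has children": [0.0, 0.1, 0.9],
}

print(classify([0.8, 0.2, 0.0], ATTRIBUTE_VECTORS))
```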

Further, it before inputting text information based on the default Natural Language Processing Models of NLP technology, can incite somebody to action Corresponding text information carries out uniform format, inputs the text information extremely default nature language based on NLP technology of unified format In speech processing model.Specifically, the lattice for the text information that can be handled in the default Natural Language Processing Models based on NLP technology Formula can be the formats such as CSV, TXT, EXCEL, for the text information of other formats, can be converted by way of pasting duplication At one of above-mentioned format.

Step S105: add a corresponding tag to the user according to the classification result.

Once the classification result of the user attribute is obtained, a tag is added to the user according to that result. In this way, text information is extracted from the different types of user data provided by the customer (text, pictures and voice), the attribute corresponding to the user's text information is determined, and a corresponding tag is added to the user according to that attribute.

After step S105, the method includes:

Pushing the tags that have been added to the user to the customer service agent serving the user.

When the user communicates with customer service, the tags added to the user are pushed to the customer service agent, so that the agent can answer the user in a targeted way and recommend suitable products and services.
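Adding tags and pushing them to customer service can be sketched as below. The `TagStore` class, the `push_tags_to_agent` function and the message shape are hypothetical; the application does not describe a storage layout or a push interface, so the push is modelled as a plain callable.

```python
from collections import defaultdict


class TagStore:
    """In-memory record of the tags added to each user (illustrative only)."""

    def __init__(self):
        self._tags = defaultdict(set)

    def add_tag(self, user_id, classification_result):
        # The classification result of the user attribute becomes the tag.
        self._tags[user_id].add(classification_result)

    def tags_for(self, user_id):
        return sorted(self._tags.get(user_id, set()))


def push_tags_to_agent(store, user_id, send):
    """Send the user's tags through `send` when the user contacts customer service."""
    tags = store.tags_for(user_id)
    send({"user_id": user_id, "tags": tags})
    return tags


store = TagStore()
store.add_tag("user-001", "has a car")
store.add_tag("user-001", "has children")

outbox = []  # stands in for the channel to the customer service agent
push_tags_to_agent(store, "user-001", outbox.append)
print(outbox)
```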

In summary, the user's text information is obtained, the user is classified based on an understanding of that text information, and a corresponding tag is then added to the user, so that tags are added automatically. This is intended to solve the problem that existing user tag systems rely on manual tagging, which wastes a large amount of manpower and may cause a large number of user tags to be added incorrectly.

As shown in Fig. 2, an embodiment of the application proposes an apparatus 1 for adding user tags. The apparatus 1 includes an acquisition module 11, an input module 12, a word vector obtaining module 13, a classification module 14 and a tag adding module 15.

The acquisition module 11 is configured to obtain text information containing a user's personal information.

The acquisition module 11 includes:

A first acquisition module, configured to obtain user data of different user data types that contains the user's personal information;

A first obtaining module, configured to process the user data in a preset processing mode, according to the user data type of the user data, to obtain text information.

The first obtaining module includes:

A first identification submodule, configured to identify the suffix information of the user data;

A first judgment submodule, configured to judge the user data type according to the suffix information;

A first calling submodule, configured to call, according to the user data type, the preset processing mode corresponding to that user data type to process the user data and obtain text information.

In this embodiment, the first obtaining module includes:

A first obtaining submodule, configured to convert the user data into text information with a preset ASR model when the user data type is the voice information type.

In this embodiment, the first obtaining module includes:

An extraction module, configured to extract the text in the user data with a preset OCR-based text recognition model when the user data type is the picture information type;

A second obtaining submodule, configured to convert the extracted text into text information.

In this embodiment, the first obtaining module includes:

A third obtaining module, configured to output the text information directly when the user data type is the text information type.

The apparatus 1 includes:

The input module 12, configured to input the text information into the preset natural language processing model based on NLP technology;

The word vector obtaining module 13, configured to process the text information with the natural language processing model to obtain the word vector corresponding to the text information;

The classification module 14, configured to classify the word vector corresponding to the text information with the preset clustering algorithm to obtain the classification result of the user attribute corresponding to the user's personal information.

Specifically, the classification module 14 includes:

A calculation module, configured to use cosine similarity to find, among the word vectors corresponding to a plurality of preset user attributes, the target word vector whose included-angle cosine with the word vector corresponding to the text information is close to a preset threshold;

A confirmation module, configured to determine the user attribute corresponding to the target word vector as the classification result of the user attribute corresponding to the user's personal information.

The tag adding module 15, configured to add a corresponding tag to the user according to the classification result.

The apparatus 1 further includes:

A pushing module, configured to push the tags that have been added to the user to the customer service agent serving the user.
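The way the modules listed above chain together can be sketched as a small pipeline object. The class name `TaggingPipeline`, its field names and the stand-in callables are assumptions; each callable marks where the corresponding module would plug in, and the input module and word vector obtaining module are folded into one `vectorize` step for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class TaggingPipeline:
    acquire: Callable[[str], str]                # acquisition module 11
    vectorize: Callable[[str], Sequence[float]]  # input module 12 and word vector obtaining module 13
    classify: Callable[[Sequence[float]], str]   # classification module 14
    tags: Dict[str, List[str]] = field(default_factory=dict)

    def run(self, user_id: str, raw: str) -> str:
        text = self.acquire(raw)
        vector = self.vectorize(text)
        result = self.classify(vector)
        # Tag adding module 15: record the classification result as a tag.
        self.tags.setdefault(user_id, []).append(result)
        return result


# Stand-in callables; real modules would wrap the ASR, OCR and NLP models.
pipeline = TaggingPipeline(
    acquire=str.strip,
    vectorize=lambda text: [float(len(text))],
    classify=lambda vector: "has a car" if vector[0] > 3 else "unknown",
)
print(pipeline.run("user-001", "  bought vehicle damage insurance  "))
print(pipeline.tags)
```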

As shown in Fig. 3, an embodiment of the application also provides a computer device. The computer device may be a server, and its internal structure may be as shown in Fig. 3. The computer device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores data such as the models used by the method for adding user tags. The network interface of the computer device communicates with external terminals over a network connection. When executed by the processor, the computer program implements a method for adding user tags.

The processor performs the steps of the method for adding user tags: obtaining text information containing a user's personal information; inputting the text information into a preset natural language processing model based on NLP technology; processing the text information with the natural language processing model to obtain the word vector corresponding to the text information; classifying the word vector corresponding to the text information with a preset clustering algorithm to obtain a classification result of the user attribute corresponding to the user's personal information; and adding a corresponding tag to the user according to the classification result.

In one embodiment, the step of obtaining text information containing the user's personal information includes:

Obtaining different types of user data containing personal information;

Processing the user data in a preset processing mode, according to the user data type of the user data, to obtain text information.

In one embodiment, the step of processing the user data in a preset processing mode, according to the user data type of the user data, to obtain text information includes:

When the user data type is the voice information type, converting the voice information into text information with a preset ASR model.

When the user data type is the picture information type, extracting the text in the picture information with a preset OCR-based text recognition model;

Converting the extracted text into text information.

In one embodiment, the step of classifying the word vector corresponding to the text information with the preset clustering algorithm to obtain the classification result of the user attribute corresponding to the user's personal information includes:

Identifying the suffix information of the user data;

Judging the user data type according to the suffix information;

Calling, according to the user data type, the preset processing mode corresponding to that user data type to process the user data and obtain text information.

In one embodiment, above-mentioned according to the classification results, the step of corresponding label is added to the user it Afterwards, comprising:

Those skilled in the art will understand that the structure shown in Fig. 3 is only a block diagram of the part of the structure relevant to the solution of this application, and does not limit the computer equipment to which the solution is applied.

The computer equipment of this embodiment obtains the user's text information, classifies the user based on an understanding of that text information, and then adds a corresponding tag to the user. Adding tags automatically is intended to solve the problem that existing user tag systems rely on manual tagging, which wastes a great deal of labor and may cause a large number of user tags to be added incorrectly.

An embodiment of this application also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements a method for adding a user tag, specifically: obtaining text information containing the user's personal information; inputting the text information into a preset natural language processing model based on NLP technology; processing the text information with the natural language processing model to obtain the word vector corresponding to the text information; classifying the word vector corresponding to the text information by a preset clustering algorithm to obtain a classification result of the user attribute corresponding to the user's personal information; and adding a corresponding tag to the user according to the classification result.

Obtaining user data of different types containing personal information;

Converting the extracted text into text information.

Identifying the suffix information of the user data;

Determining the user data type according to the suffix information;

The storage medium of this embodiment obtains the user's text information, classifies the user based on an understanding of that text information, and then adds a corresponding tag to the user. Adding tags automatically is intended to solve the problem that existing user tag systems rely on manual tagging, which wastes a great deal of labor and may cause a large number of user tags to be added incorrectly.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in this application and its embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The foregoing describes only the preferred embodiments of this application and is not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.

Claims

1. A method for adding a user tag, characterized in that the method comprises:

Obtaining text information containing the user's personal information;

Classifying the word vector corresponding to the text information by a preset clustering algorithm, to obtain a classification result of the user attribute corresponding to the user's personal information;

2. The method for adding a user tag according to claim 1, characterized in that the step of obtaining text information containing the user's personal information comprises:

Processing the user data by a preset processing mode, according to the user data type corresponding to the user data, to obtain the text information.

3. The method for adding a user tag according to claim 2, characterized in that the step of processing the user data by a preset processing mode, according to the user data type corresponding to the user data, to obtain the text information comprises:

When the user data type is the voice information type, converting the user data into the text information by a preset ASR model.

4. The method for adding a user tag according to claim 2, characterized in that the step of processing the user data by a preset processing mode, according to the user data type corresponding to the user data, to obtain the text information comprises:

When the user data type is the picture information type, extracting the text in the user data by a preset OCR-based text recognition model;

Converting the extracted text into the text information.

5. The method for adding a user tag according to claim 1, characterized in that the step of classifying the word vector corresponding to the text information by a preset clustering algorithm, to obtain the classification result of the user attribute corresponding to the user's personal information, comprises:

Calculating, by the cosine similarity method, among the word vectors corresponding to a plurality of preset user attributes, a target word vector whose included-angle cosine value with the word vector corresponding to the text information is close to a preset threshold;

Determining that the user attribute corresponding to the target word vector is the classification result of the user attribute corresponding to the user's personal information.
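For illustration only, and not as part of the claims, the cosine comparison can be sketched as follows. The function names, the example vectors and the default threshold of 1.0 are assumptions; the source does not state a threshold value or how "close to" is measured, so the sketch simply picks the attribute whose cosine value lies nearest the threshold.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def closest_attribute(
    text_vector: Sequence[float],
    attribute_vectors: Dict[str, Sequence[float]],
    threshold: float = 1.0,  # assumed value, not given in the source
) -> str:
    """Return the attribute whose cosine value is nearest the threshold."""
    return min(
        attribute_vectors,
        key=lambda name: abs(
            threshold - cosine_similarity(text_vector, attribute_vectors[name])
        ),
    )


# Hypothetical preset attribute vectors and a text vector to classify.
attributes = {"sports fan": [1.0, 0.0], "investor": [0.0, 1.0]}
chosen = closest_attribute([0.9, 0.1], attributes)
```

With these example vectors the text vector points almost along the "sports fan" axis, so that attribute is selected.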

6. The method for adding a user tag according to claim 2, characterized in that the step of processing the user data by a preset processing mode, according to the user data type corresponding to the user data, to obtain the text information comprises:

Identifying the suffix information of the user data;

Determining the user data type according to the suffix information;

Calling, according to the user data type, the preset processing mode corresponding to that user data type to process the user data and obtain the text information.

7. The method for adding a user tag according to claim 1, characterized in that, after the step of adding a corresponding tag to the user according to the classification result, the method comprises:

8. A device for adding a user tag, characterized in that the device comprises:

A word vector obtaining module, configured to process the text information with the natural language processing model to obtain the word vector corresponding to the text information;

A classification module, configured to classify the word vector corresponding to the text information by a preset clustering algorithm, to obtain a classification result of the user attribute corresponding to the user's personal information;

9. A computer equipment, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 7 when executed by a processor.