CN105528618B - Short picture text recognition method and device based on social networks - Google Patents

Info

Publication number
CN105528618B
Authority
CN
China
Prior art keywords
picture text
short
short picture
text
ability label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510907490.8A
Other languages
Chinese (zh)
Other versions
CN105528618A (en
Inventor
李金奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimeng Chuangke Network Technology China Co Ltd
Original Assignee
Weimeng Chuangke Network Technology China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimeng Chuangke Network Technology China Co Ltd filed Critical Weimeng Chuangke Network Technology China Co Ltd
Priority to CN201510907490.8A priority Critical patent/CN105528618B/en
Publication of CN105528618A publication Critical patent/CN105528618A/en
Application granted granted Critical
Publication of CN105528618B publication Critical patent/CN105528618B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 - Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/2193 - Validation; Performance evaluation; Active pattern learning techniques based on specific statistical tests
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/285 - Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system

Abstract

Embodiments of the present invention provide a short picture text recognition method and device based on social networks. The method includes: receiving a short picture text based on a social network; obtaining the picture count of the short picture text and the comment features of the short picture text, and obtaining the pre-built ability labels of the user corresponding to the short picture text; determining the ability label corresponding to the short picture text using a short picture text classifier, where the classifier is trained on the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text; and setting the corresponding ability label for the short picture text. By introducing the publisher's ability labels and the comment features, the technical solution draws on multiple data sources beyond the information contained in the pictures themselves, and can effectively improve the accuracy and coverage of short picture text classification.

Description

Short picture text recognition method and device based on social networks
Technical field
The present invention relates to the technical field of social networks, and in particular to a short picture text recognition method and device based on social networks.
Background
Several concepts are involved in the field of short picture text recognition on social networks. A short picture text is a post on a social network such as a microblog that contains little or no text and conveys its information mainly through the N (N > 1) pictures published in the post. A user ability label is a label describing the ability characteristics that a user shows on the social network through self-reported profile information, published posts, and similar information. The ability label corresponding to a short picture text is the ability label that describes the content of a particular post published by the user.
Prior-art schemes based on image recognition mainly use a large number of annotated pictures as a training set, learn an image recognition model from that training set, and apply the learned model to newly published pictures to recognize the people, animals, objects, and other information they contain; the recognized information is then mapped to content labels. These schemes have the following technical defects: picture recognition accuracy is relatively low, the misjudgment rate is relatively high, and the recognized picture information is difficult to associate with user labels.
Summary of the invention
Embodiments of the present invention provide a short picture text recognition method and device based on social networks, so as to effectively improve the accuracy and coverage of short picture text classification.
In one aspect, an embodiment of the present invention provides a short picture text recognition method based on social networks, the method including:
receiving a short picture text based on a social network;
obtaining the picture count of the short picture text and the comment features of the short picture text, and obtaining the pre-built ability labels of the user corresponding to the short picture text;
determining the ability label corresponding to the short picture text using a short picture text classifier, where the short picture text classifier is trained on the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text;
setting the corresponding ability label for the short picture text.
In another aspect, an embodiment of the present invention provides a short picture text recognition device based on social networks, the device including:
a receiving unit, configured to receive a short picture text based on a social network;
an acquiring unit, configured to obtain the picture count of the short picture text and the comment features of the short picture text, and to obtain the pre-built ability labels of the user corresponding to the short picture text;
a classification unit, configured to determine the ability label corresponding to the short picture text using a short picture text classifier, where the short picture text classifier is trained on the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text;
a label unit, configured to set the corresponding ability label for the short picture text.
The above technical solution has the following beneficial effects. Prior-art short picture text classification first recognizes the objects or people in the pictures and then maps the recognition results to labels; both the recognition and the mapping use only the information in the pictures themselves, which results in relatively low accuracy and coverage. The technical solution of the present invention introduces the publisher's ability labels and the comment features, drawing on multiple data sources beyond the information contained in the pictures themselves, and can effectively improve the accuracy and coverage of short picture text classification.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below evidently show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a short picture text recognition method based on social networks according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a short picture text recognition device based on social networks according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of another short picture text recognition device based on social networks according to an embodiment of the present invention;
Fig. 4 is a table of short picture text classification and recognition results for an application example of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of a short picture text recognition method based on social networks according to an embodiment of the present invention. The method includes:
101. receiving a short picture text based on a social network;
102. obtaining the picture count of the short picture text and the comment features of the short picture text, and obtaining the pre-built ability labels of the user corresponding to the short picture text;
103. determining the ability label corresponding to the short picture text using a short picture text classifier, where the short picture text classifier is trained on the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text;
104. setting the corresponding ability label for the short picture text.
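As a rough illustration of steps 101 to 104, the following Python sketch wires the three feature groups into a classifier call. The `ShortPicturePost` structure, the `recognize` function, and the stand-in `keyword_features` and `stub_classifier` helpers are hypothetical names introduced here for illustration; they are not defined by the patent, and a trained classifier would take the place of the stub.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ShortPicturePost:
    """Hypothetical container for a short picture text (step 101 input)."""
    user_id: str
    picture_count: int
    comments: List[str] = field(default_factory=list)
    ability_label: str = ""


def recognize(
    post: ShortPicturePost,
    user_labels: Dict[str, Dict[str, float]],
    extract_comment_features: Callable[[List[str]], List[str]],
    classify: Callable[[dict], str],
) -> ShortPicturePost:
    # Step 102: collect picture count, comment features, and the
    # pre-built ability labels of the posting user.
    features = {
        "picture_count": post.picture_count,
        "comment_features": extract_comment_features(post.comments),
        "user_ability_labels": user_labels.get(post.user_id, {}),
    }
    # Step 103: let the classifier decide the label.
    label = classify(features)
    # Step 104: attach the label to the post.
    post.ability_label = label
    return post


def keyword_features(comments: List[str]) -> List[str]:
    """Stand-in feature extractor: the distinct lower-cased words."""
    return sorted(set(" ".join(comments).lower().split()))


def stub_classifier(features: dict) -> str:
    """Stand-in for a trained classifier (assumption, not the patent's model)."""
    return "travel" if "beach" in features["comment_features"] else "other"
```

A call such as `recognize(ShortPicturePost("u1", 9, ["Lovely beach"]), {"u1": {"travel": 0.8}}, keyword_features, stub_classifier)` returns the post with its `ability_label` field filled in.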
Preferably, setting the corresponding ability label for the short picture text specifically includes: using the pre-built ability labels of the user corresponding to the short picture text and their corresponding weights, applying an up-weighting or down-weighting correction to the determined ability label of the short picture text; and setting the corrected (up-weighted or down-weighted) ability label for the short picture text. The formula for the weight of an ability label held by the user is not reproduced in this text; in it, count is the total number of times the user corresponding to the short picture text has been assigned to related groups in the ability label mapping set, and the ability label mapping set is the set of labels associated with the ability labels of the user corresponding to the short picture text.
Preferably, the method of building the ability labels of the user corresponding to the short picture text and their corresponding weights specifically includes: using the ability label mapping set and the grouping information that other users have assigned to the user corresponding to the short picture text, building the ability labels of that user and their corresponding weights.
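The up-weighting or down-weighting correction can be pictured with a small sketch. Because the weight formula is not available in this text, the rule below is purely an assumption made for illustration: a label the user already holds has its classifier score multiplied by (1 + weight), and any other label is multiplied by a fixed `penalty` factor. The names `amend_scores`, `best_label`, and `penalty` are hypothetical.

```python
from typing import Dict


def amend_scores(
    scores: Dict[str, float],
    user_weights: Dict[str, float],
    penalty: float = 0.5,
) -> Dict[str, float]:
    """Adjust classifier scores using the publisher's ability-label weights.

    Assumed rule (not taken from the patent): boost labels the user holds,
    damp labels the user does not hold.
    """
    amended = {}
    for label, score in scores.items():
        if label in user_weights:
            amended[label] = score * (1.0 + user_weights[label])  # up-weight
        else:
            amended[label] = score * penalty  # down-weight
    return amended


def best_label(scores: Dict[str, float]) -> str:
    """Return the label with the highest score."""
    return max(scores, key=scores.get)
```

Under this assumed rule, a post whose raw scores slightly favor "car" can still be labeled "travel" when the publisher carries a strong "travel" ability label.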
Preferably, the method of training the short picture text classifier on the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text includes:
for the preset fields in which short picture texts frequently occur, extracting the comment information of the related short picture texts in each field as a training set; treating all comments on each short picture text as one article; extracting the comment features of the field using the tfidf formula; and, with the extracted comment features of the field as the training set, pre-training the short picture text classifier, where the tfidf formula is:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$
The expressions for $\mathrm{tf}_{i,j}$ and $\mathrm{idf}_i$ are not reproduced in this text; in them, $n_{i,j}$ denotes the number of times the i-th word appears in the j-th article,
and $|D|$ denotes the total number of texts in the training set;
using the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text as the feature set of the short picture text classifier, performing optimization training on the pre-trained short picture text classifier to obtain the short picture text classifier used for short picture text classification.
Preferably, determining the ability label corresponding to the short picture text using the short picture text classifier specifically includes: for the short picture text, determining the field to which the short picture text belongs using the short picture text classifier obtained after optimization training, and taking that field as the ability label corresponding to the short picture text.
Corresponding to the above method embodiment, Fig. 2 is a schematic structural diagram of a short picture text recognition device based on social networks according to an embodiment of the present invention. The device includes:
a receiving unit 21, configured to receive a short picture text based on a social network;
an acquiring unit 22, configured to obtain the picture count of the short picture text and the comment features of the short picture text, and to obtain the pre-built ability labels of the user corresponding to the short picture text;
a classification unit 23, configured to determine the ability label corresponding to the short picture text using a short picture text classifier, where the short picture text classifier is trained on the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text;
a label unit 24, configured to set the corresponding ability label for the short picture text.
Preferably, Fig. 3 is a schematic structural diagram of another short picture text recognition device based on social networks according to an embodiment of the present invention. In addition to the receiving unit 21, the acquiring unit 22, the classification unit 23, and the label unit 24, the device further includes:
a correction unit 25, configured to use the pre-built ability labels of the user corresponding to the short picture text and their corresponding weights to apply an up-weighting or down-weighting correction to the determined ability label of the short picture text. The formula for the weight of an ability label held by the user is not reproduced in this text; in it, count is the total number of times the user corresponding to the short picture text has been assigned to related groups in the ability label mapping set, and the ability label mapping set is the set of labels associated with the ability labels of the user corresponding to the short picture text;
the label unit 24 is specifically configured to set the corrected (up-weighted or down-weighted) ability label for the short picture text.
Preferably, the short picture text recognition device based on social networks further includes:
a building unit 26, configured to use the ability label mapping set and the grouping information that other users have assigned to the user corresponding to the short picture text to build the ability labels of that user and their corresponding weights.
Preferably, the short picture text recognition device based on social networks further includes:
a training unit 27, configured to: for the preset fields in which short picture texts frequently occur, extract the comment information of the related short picture texts in each field as a training set; treat all comments on each short picture text as one article; extract the comment features of the field using the tfidf formula; and, with the extracted comment features of the field as the training set, pre-train the short picture text classifier, where the tfidf formula is:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$
The expressions for $\mathrm{tf}_{i,j}$ and $\mathrm{idf}_i$ are not reproduced in this text; in them, $n_{i,j}$ denotes the number of times the i-th word appears in the j-th article,
and $|D|$ denotes the total number of texts in the training set;
and to use the picture count of the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text as the feature set of the short picture text classifier, performing optimization training on the pre-trained short picture text classifier to obtain the short picture text classifier used for short picture text classification.
Preferably, the classification unit 23 is further specifically configured to: for the short picture text, determine the field to which the short picture text belongs using the short picture text classifier obtained after optimization training, and take that field as the ability label corresponding to the short picture text.
An application example is described in detail below:
The technical solution of this application example first builds users' ability labels based on the social network. It then extracts the comment features of short picture posts: on a social network, comments are feedback on the information a short picture post is meant to express, so they can fully and effectively reflect the characteristics of the post. Finally, it uses the user's ability labels, the user comment features, the number of pictures in the short picture post, and other text information as features, and classifies short picture posts with a suitable classifier.
The specific steps are as follows:
1. Build the user's ability labels: based on users' grouping information and the ability label mapping set, build the user's ability labels.
2. Extract the comment features of short picture posts: build a training set of short picture posts in picture-related fields, extract their comment information, treat all comments on each short picture post as one article, and use tfidf to compute the comment feature set of the short picture posts in each specific field.
3. Short picture text classification and recognition: based on the user ability labels and comment features built in the preceding steps, combined with the number of pictures and other text features, use a classification model to set the corresponding ability label for the short picture post.
I. Building the user's ability labels
On social networks such as microblogs, the groups into which followers place a user reflect the ability labels that the user has. Using the ability label mapping set that has already been built, combined with followers' grouping information for the user, this application example can build the user's ability labels and their corresponding weights.
The weight of a specific ability label is calculated as:
(Formula 1; the expression is not reproduced in this text)
where count is the total number of times the user has been assigned to related groups in the ability label mapping set.
Table 1: User ability label list
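The count used in Formula 1 can be sketched as follows. Since neither the formula nor the layout of the mapping set is given in this text, the sketch stops at the raw counts and assumes a simple dictionary that maps each ability label to a set of related group names; `count_label_groupings` and that dictionary layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def count_label_groupings(
    group_names: Iterable[str],
    label_mapping: Dict[str, Set[str]],
) -> Counter:
    """Count, per ability label, how often followers grouped a user under it.

    group_names: names of the groups into which followers placed one user.
    label_mapping: ability label -> related group names (assumed layout of
    the ability label mapping set).
    """
    counts: Counter = Counter()
    for name in group_names:
        key = name.strip().lower()  # normalise follower-entered group names
        for label, related in label_mapping.items():
            if key in related:
                counts[label] += 1
    return counts
```

For example, with a mapping of `{"travel": {"travel", "trips"}, "car": {"cars", "auto"}}`, the group names `["Travel", "trips", "friends", "Cars"]` yield a count of 2 for "travel" and 1 for "car"; the unrelated group "friends" is ignored.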
II. Extracting the comment features of short picture posts
For fields in which short picture posts frequently occur, such as travel and automobiles, the comment information of the related short picture posts is extracted as a training set. All comments on each short picture post are treated as one article, and the tfidf formula is used to extract the relevant features of each field, which serve as the text features of the subsequent short picture post classifier.
tf formula: (Formula 2; the expression is not reproduced in this text)
where $n_{i,j}$ denotes the number of times the i-th word appears in the j-th article.
idf formula: (Formula 3; the expression is not reproduced in this text)
where $|D|$ denotes the total number of texts in the training set.
tfidf formula: $\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$ (Formula 4)
The extracted comment features and weights are shown in Table 2 below:
Table 2: Comment features and weights for the travel and automobile fields
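A minimal sketch of the comment-feature weighting follows. It assumes the common textbook definitions, tf as a word's count divided by the article's total word count and idf as the logarithm of |D| divided by the number of articles containing the word; the patent's own Formulas 2 and 3 are not available here and may differ in detail. Each inner list stands for one "article", that is, all comments on one short picture post after tokenization (tokenization itself is out of scope).

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {word: tf-idf weight} dict per document."""
    n_docs = len(documents)  # |D|
    doc_freq: Counter = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # number of documents containing each word

    weights = []
    for doc in documents:
        counts = Counter(doc)  # n_{i,j} for every word i in document j
        total = sum(counts.values())
        weights.append({
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in counts.items()
        })
    return weights
```

Words that occur in every article get a weight of zero under this definition, so field-specific words such as "beach" or "engine" stand out as comment features.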
III. Short picture text classification
Using the short picture text sets of related fields such as travel and automobiles extracted in the preceding steps as the training set, and using the user's ability labels, the picture count, and the picture comment features of the related fields extracted in step II as the classifier's feature set, a classifier for short picture text classification is trained based on a naive Bayes or SVM (Support Vector Machine) model.
For a new short picture text, the feature information required by the classifier is extracted as input; the trained classifier outputs the field to which the text belongs, and the corresponding ability label is set for the short picture text. Fig. 4 shows the table of short picture text classification and recognition results for this application example, in which the post label is the ability label corresponding to the short picture text.
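The training and prediction steps can be sketched with a small multinomial naive Bayes classifier written from scratch. This is one of the two model families the text names, not the patent's actual implementation. The feature encoding in `to_tokens` (a coarse picture-count bucket with an arbitrary threshold of 4, plus one token per user label and per comment word) is an assumption made for illustration, as are all names below.

```python
import math
from collections import Counter, defaultdict
from typing import List, Tuple


def to_tokens(picture_count: int, user_labels: List[str],
              comment_words: List[str]) -> List[str]:
    """Encode the three feature groups as discrete tokens (assumed encoding)."""
    bucket = "pics=many" if picture_count >= 4 else "pics=few"
    return ([bucket]
            + [f"user={label}" for label in user_labels]
            + [f"word={word}" for word in comment_words])


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples: List[Tuple[List[str], str]]) -> "NaiveBayes":
        self.class_counts = Counter(label for _, label in samples)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in samples:
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens: List[str]) -> str:
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for token in tokens:
                if token in self.vocab:  # ignore tokens never seen in training
                    score += math.log((self.token_counts[label][token] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best
```

The predicted field is then used directly as the ability label of the new short picture text, as described above.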
Prior-art short picture text classification first recognizes the objects or people in the pictures and then maps the recognition results to labels; both the recognition and the mapping use only the information in the pictures themselves, which results in relatively low accuracy and coverage. The technical solution of this application example introduces the publisher's ability labels and the comment features, drawing on multiple data sources beyond the information contained in the pictures themselves, and can effectively improve the accuracy and coverage of short picture text classification.
It should be understood that the specific order or hierarchy of steps in the disclosed processes is an example of illustrative approaches. Based on design preferences, the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of protection of the present disclosure. The accompanying method claims present the elements of the various steps in an illustrative order and are not meant to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment to streamline the disclosure. This method of disclosure should not be interpreted as reflecting an intention that the embodiments of the claimed subject matter require more features than are expressly recited in each claim. Rather, as the appended claims reflect, the invention lies in fewer than all features of a single disclosed embodiment. The appended claims are therefore hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the present invention.
The disclosed embodiments are described above to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments set forth herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed in this application.
The above description includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methods for the purpose of describing the above embodiments, but one of ordinary skill in the art will recognize that further combinations and permutations of the various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications, and variations that fall within the scope of protection of the appended claims. Furthermore, to the extent that the term "includes" is used in the specification or the claims, it is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when used as a transitional word in a claim. In addition, any use of the term "or" in the specification or the claims is intended to mean a "non-exclusive or".
Those skilled in the art will also appreciate that the various illustrative logical blocks, units, and steps listed in the embodiments of the present invention may be implemented by electronic hardware, computer software, or a combination of both. To clearly show the interchangeability of hardware and software, the various illustrative components, units, and steps above have been described generally in terms of their functionality. Whether such functionality is implemented in hardware or software depends on the particular application and the design requirements of the overall system. Those skilled in the art may implement the described functionality in various ways for each particular application, but such implementation should not be interpreted as going beyond the scope of protection of the embodiments of the present invention.
The various illustrative logical blocks or units described in the embodiments of the present invention may be implemented or operated by a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of the above designed to perform the described functions. The general-purpose processor may be a microprocessor; alternatively, it may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, such as a digital signal processor and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of the methods or algorithms described in the embodiments of the present invention may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be stored in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. Illustratively, the storage medium may be connected to the processor so that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integrated into the processor. The processor and the storage medium may be located in an ASIC, and the ASIC may be located in a user terminal. Alternatively, the processor and the storage medium may be located in different components of the user terminal.
In one or more exemplary designs, the functions described in the embodiments of the present invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include computer storage media and communication media that facilitate the transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. For example, such computer-readable media may include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store program code in the form of instructions or data structures readable by a general-purpose or special-purpose computer or a general-purpose or special-purpose processor. In addition, any connection may properly be termed a computer-readable medium. For example, if software is transmitted from a website, server, or other remote source over coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then those are also included in the definition of computer-readable medium. Disk and disc, as used here, include compact disc, laser disc, optical disc, DVD, floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically and discs usually reproduce data optically with lasers. Combinations of the above may also be included within computer-readable media.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely a specific embodiment of the present invention and is not intended to limit its scope of protection; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (8)

1. A short picture text recognition method based on a social network, characterized by comprising:
receiving a short picture text based on a social network;
obtaining the number of pictures in the short picture text and the comment features of the short picture text, and obtaining the pre-built ability labels of the user corresponding to the short picture text;
determining, using a short picture text classifier, the ability label corresponding to the short picture text, wherein the short picture text classifier is trained on the number of pictures in the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text;
setting the corresponding ability label for the short picture text, which specifically comprises: using the pre-built ability labels of the user corresponding to the short picture text and their corresponding weights, applying an up-weighting or down-weighting correction to the determined ability label corresponding to the short picture text; and setting the up-weighted or down-weighted ability label for the short picture text; wherein, in the calculation formula for the weight corresponding to an ability label of the user, count is the total number of times the user corresponding to the short picture text has been assigned to associated groups in the ability label mapping set, and the ability label mapping set is the set of tags that have an association with the ability labels of the user corresponding to the short picture text.
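The correction step of claim 1 can be sketched as follows. This is a minimal illustration, not the patented formula: the weight formula is not reproduced in this text, so the function names `correct_label_scores` and `pick_label`, the `1 + weight` boost, and the `penalty` factor are all hypothetical choices made only to show an up-weighting and down-weighting pass over classifier scores.

```python
from typing import Dict


def correct_label_scores(
    classifier_scores: Dict[str, float],
    user_label_weights: Dict[str, float],
    penalty: float = 0.8,
) -> Dict[str, float]:
    """Up-weight labels the posting user holds, down-weight the others.

    The multiplicative scheme (1 + weight, and a flat penalty) is an
    assumption for illustration; the source does not specify it.
    """
    corrected = {}
    for label, score in classifier_scores.items():
        if label in user_label_weights:
            # The user has this ability label: boost by its weight.
            corrected[label] = score * (1.0 + user_label_weights[label])
        else:
            # The user lacks this ability label: apply the flat penalty.
            corrected[label] = score * penalty
    return corrected


def pick_label(scores: Dict[str, float]) -> str:
    """Return the label with the highest corrected score."""
    return max(scores, key=scores.get)
```

Under these assumptions, a post scored `{"food": 0.5, "travel": 0.45}` by the classifier and written by a user whose only ability label is `travel` with weight 0.5 would be relabeled as `travel`, because the user's label is boosted while the other is penalized.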
2. The short picture text recognition method based on a social network according to claim 1, characterized in that the method of building the ability labels of the user corresponding to the short picture text and their corresponding weights specifically comprises:
building the ability labels of the user corresponding to the short picture text and their corresponding weights using the ability label mapping set and the grouping information that other users have applied to the user corresponding to the short picture text.
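One way to picture the construction in claim 2 is a counting pass over the groups into which other users have placed the author. The sketch below is an assumption-laden illustration: the shape of the mapping set (ability label to a set of associated group names), the function names, and the share-of-total normalization are hypothetical, and the normalization only stands in for the weight formula, which is not reproduced in this text.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def count_label_assignments(
    label_mapping: Dict[str, Set[str]],
    group_assignments: Iterable[str],
) -> Dict[str, int]:
    """Count, per ability label, how often the user was put in an associated group.

    label_mapping: ability label -> group names associated with that label.
    group_assignments: one entry per time another user grouped this user.
    """
    counts: Counter = Counter()
    for group in group_assignments:
        for label, groups in label_mapping.items():
            if group in groups:
                counts[label] += 1
    return dict(counts)


def normalized_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Placeholder weighting: each label's share of all counted assignments."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: c / total for label, c in counts.items()}
```

Groups that match no entry in the mapping set (for example a generic "classmates" group) contribute nothing, so only groupings that carry ability information affect the resulting labels.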
3. The short picture text recognition method based on a social network according to claim 1, characterized in that the method of training the short picture text classifier on the number of pictures in the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text comprises:
for each preset field in which short picture texts frequently appear, extracting the comment information of the related short picture texts in that field as a training set, treating all the comment information of each short picture text as one article, extracting the comment features of the field using the tfidf formula, and pre-training the short picture text classifier with the extracted comment features of the field as the training set, wherein the tfidf formula is:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$
where $n_{i,j}$ denotes the frequency with which the $i$-th word appears in the $j$-th article;
$|D|$ denotes the total number of texts in the training set;
using the number of pictures in the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text as the feature set of the short picture text classifier, performing optimization training on the pre-trained short picture text classifier to obtain the short picture text classifier used for short picture text classification.
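The comment-feature extraction in claim 3 can be illustrated with a small TF-IDF routine. The definitions of the tf and idf terms are not reproduced in this text, so the sketch assumes the common textbook forms, tf as the term count divided by the article length and idf as the logarithm of $|D|$ over the number of articles containing the term. The function name `tfidf` and the pre-tokenized input are likewise assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Compute tfidf[i, j] = tf[i, j] * idf[i] for tokenized articles.

    Each article stands for all comments of one short picture text.
    Assumed definitions: tf = n_ij / (words in article j),
    idf = log(|D| / number of articles containing word i).
    """
    n_docs = len(documents)
    doc_freq: Counter = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each word once per article

    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        })
    return scores
```

With this definition, a word that appears in every article of the training set gets a score of zero, which is why field-specific comment words end up dominating the extracted features.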
4. The short picture text recognition method based on a social network according to claim 3, characterized in that determining, using the short picture text classifier, the ability label corresponding to the short picture text specifically comprises:
for the short picture text, determining the field to which the short picture text belongs using the short picture text classifier obtained after optimization training, and taking that field as the ability label corresponding to the short picture text.
5. A short picture text recognition device based on a social network, characterized in that the short picture text recognition device based on a social network comprises:
a receiving unit, configured to receive a short picture text based on a social network;
an acquiring unit, configured to obtain the number of pictures in the short picture text and the comment features of the short picture text, and to obtain the pre-built ability labels of the user corresponding to the short picture text;
a classification unit, configured to determine, using a short picture text classifier, the ability label corresponding to the short picture text, wherein the short picture text classifier is trained on the number of pictures in the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text;
a correction unit, configured to apply, using the pre-built ability labels of the user corresponding to the short picture text and their corresponding weights, an up-weighting or down-weighting correction to the determined ability label corresponding to the short picture text; wherein, in the calculation formula for the weight corresponding to an ability label of the user, count is the total number of times the user corresponding to the short picture text has been assigned to associated groups in the ability label mapping set, and the ability label mapping set is the set of tags that have an association with the ability labels of the user corresponding to the short picture text;
a tag unit, configured to set the up-weighted or down-weighted ability label for the short picture text.
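The unit structure of the device in claim 5 maps naturally onto a small pipeline object. The sketch below is only an illustration under stated assumptions: the class names `ShortPictureText` and `RecognitionDevice`, the injected `classify` callable, the per-user weight table, and the multiplicative correction with a `penalty` factor are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ShortPictureText:
    """What the receiving unit hands on: author, pictures, comments."""
    user_id: str
    picture_count: int
    comments: List[str]


@dataclass
class RecognitionDevice:
    # Classification unit: returns a score per candidate ability label.
    classify: Callable[[int, List[str], Dict[str, float]], Dict[str, float]]
    # Pre-built table: user id -> {ability label: weight}.
    user_labels: Dict[str, Dict[str, float]]
    penalty: float = 0.8  # assumed down-weighting factor

    def recognize(self, post: ShortPictureText) -> str:
        # Acquiring unit: look up the author's ability labels and weights.
        labels = self.user_labels.get(post.user_id, {})
        # Classification unit: score the post from its three feature groups.
        scores = self.classify(post.picture_count, post.comments, labels)
        # Correction unit: boost labels the author holds, penalize the rest.
        corrected = {}
        for label, score in scores.items():
            if label in labels:
                corrected[label] = score * (1.0 + labels[label])
            else:
                corrected[label] = score * self.penalty
        # Tag unit: the highest corrected score becomes the post's label.
        return max(corrected, key=corrected.get)
```

Passing the classifier in as a callable keeps the sketch independent of any particular model, so a stub can stand in for the trained short picture text classifier when exercising the pipeline.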
6. The short picture text recognition device based on a social network according to claim 5, characterized in that the short picture text recognition device based on a social network further comprises:
a construction unit, configured to build the ability labels of the user corresponding to the short picture text and their corresponding weights using the ability label mapping set and the grouping information that other users have applied to the user corresponding to the short picture text.
7. The short picture text recognition device based on a social network according to claim 5, characterized in that the short picture text recognition device based on a social network further comprises:
a training unit, configured to, for each preset field in which short picture texts frequently appear, extract the comment information of the related short picture texts in that field as a training set, treat all the comment information of each short picture text as one article, extract the comment features of the field using the tfidf formula, and pre-train the short picture text classifier with the extracted comment features of the field as the training set, wherein the tfidf formula is:
$\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$
where $n_{i,j}$ denotes the frequency with which the $i$-th word appears in the $j$-th article;
$|D|$ denotes the total number of texts in the training set;
and to use the number of pictures in the short picture text, the comment features of the short picture text, and the ability labels of the user corresponding to the short picture text as the feature set of the short picture text classifier, performing optimization training on the pre-trained short picture text classifier to obtain the short picture text classifier used for short picture text classification.
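The feature set used for the optimization training combines three kinds of input: the picture count, the comment features, and the author's ability labels. A minimal sketch of how these might be flattened into one numeric vector is shown below. The function name `build_feature_vector`, the fixed `vocabulary` and `label_space` orderings, and the binary encoding of ability labels are assumptions for illustration, since the source does not specify an encoding.

```python
from typing import Dict, List, Sequence


def build_feature_vector(
    picture_count: int,
    comment_features: Dict[str, float],
    user_labels: Sequence[str],
    vocabulary: Sequence[str],
    label_space: Sequence[str],
) -> List[float]:
    """Concatenate picture count, comment scores, and ability-label flags.

    vocabulary fixes the order of comment-feature slots; words missing
    from comment_features get 0.0. label_space fixes the order of the
    ability-label slots; a slot is 1.0 when the author holds that label.
    """
    vector = [float(picture_count)]
    vector.extend(comment_features.get(term, 0.0) for term in vocabulary)
    owned = set(user_labels)
    vector.extend(1.0 if label in owned else 0.0 for label in label_space)
    return vector
```

Fixing the vocabulary and label order up front means every short picture text yields a vector of the same length, which is what a typical classifier expects during both pre-training and optimization training.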
8. The short picture text recognition device based on a social network according to claim 7, characterized in that the classification unit is further specifically configured to, for the short picture text, determine the field to which the short picture text belongs using the short picture text classifier obtained after optimization training, and take that field as the ability label corresponding to the short picture text.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510907490.8A CN105528618B (en) 2015-12-09 2015-12-09 A kind of short picture text recognition method and device based on social networks

Publications (2)

Publication Number Publication Date
CN105528618A CN105528618A (en) 2016-04-27
CN105528618B true CN105528618B (en) 2019-06-04

Family

ID=55770832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510907490.8A Active CN105528618B (en) 2015-12-09 2015-12-09 A kind of short picture text recognition method and device based on social networks

Country Status (1)

Country Link
CN (1) CN105528618B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107545271B (en) * 2016-06-29 2021-04-09 阿里巴巴集团控股有限公司 Image recognition method, device and system
CN107133258A (en) * 2017-03-22 2017-09-05 重庆允升科技有限公司 A kind of data based on selective ensemble grader label method
CN108230171A (en) * 2017-12-26 2018-06-29 爱品克科技(武汉)股份有限公司 One kind is based on timing node LDA theme algorithms
CN112699643B (en) * 2020-12-23 2024-04-19 车智互联(北京)科技有限公司 Method for generating language model and automatic article generation method
CN112926569B (en) * 2021-03-16 2022-10-18 重庆邮电大学 Method for detecting natural scene image text in social network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102193966A (en) * 2010-03-01 2011-09-21 微软公司 Event matching in social networks
CN104951542A (en) * 2015-06-19 2015-09-30 百度在线网络技术(北京)有限公司 Method and device for recognizing class of social contact short texts and method and device for training classification models

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010116333A1 (en) * 2009-04-07 2010-10-14 Alon Atsmon System and process for builiding a catalog using visual objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
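Research and Application of a Tag-Based Microblog User Interest Discovery Algorithm; Kang Haixiao; China Master's Theses Full-text Database; 2014-01-15 (No. 01); pp. 15-18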

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant