CN108520012A - Mobile internet user comment mining method based on machine learning - Google Patents

Info

Publication number
CN108520012A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810233877.3A
Other languages
Chinese (zh)
Other versions
CN108520012B (en)
Inventor
张莉
黄新越
蒋竞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-03-21
Filing date
2018-03-21
Publication date
2018-09-11
Application filed by Beihang University
Priority to CN201810233877.3A
Publication of CN108520012A
Application granted
Publication of CN108520012B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention proposes a machine-learning-based method for mining mobile internet user comments, and belongs to the fields of requirements engineering and data mining. The method comprises: step 1, selecting the field of interest and labeling data; step 2, defining the problem types; step 3, choosing the analysis and comparison approach and the data for the applications; step 4, preprocessing the data from steps 2 and 3; step 5, setting an attribute for the application type; and step 6, building a binary classifier for each problem type. The method enriches the features used by the classifier by adding a data attribute, alleviates the existing data imbalance problem to some extent through a cost-sensitive meta-classifier, and improves the classifier through a reasonable parameter configuration of the support vector machine. It improves the accuracy of comment classification, can flexibly meet the individual needs of users, and mines data more effectively than the best current user comment classification method.

Description

Mobile internet user comment mining method based on machine learning
Technical field
The present invention relates to the fields of requirements engineering and data mining, and in particular to a machine-learning-based method for mining the comments that mobile internet users make about software.
Background
Software requirements engineering is an indispensable part of software development and evolution. It is generally divided into five aspects: requirements elicitation, requirements modeling, requirements specification, requirements verification, and requirements management. Collecting the feedback that users produce after using software, and mining the various requirements it contains, is of great value to software developers.
With the development of the internet era, the ways of obtaining user feedback have become more diverse. Especially since the Web 2.0 era, user-generated content (UGC) has become a new source of user feedback. Users' online comments on software are a feedback data source that is huge in volume and rich in information, and are a typical form of UGC. Online comments usually express requirements for a product that users raise directly and on their own initiative, so their content is relatively authentic, credible, and timely.
Before mobile applications became popular, there was already much research on mining user comments on the internet, for example opinion mining of automobile reviews and systems that tally product strengths and weaknesses in online customer reviews. After the rise of the mobile internet, a large amount of online comment information has also been produced on mobile devices. Mobile applications (commonly called apps) usually have short development cycles and rapid iteration; their user base is broader, user requirements are varied and change quickly, and user feedback is richer and also more casual. In recent decades a large body of research has attempted to mine useful information from text data, but mobile application user comments are short texts compared with those handled by traditional text mining, so short text understanding techniques that differ from traditional text mining techniques may be needed. Requirements mining based on user comments is of great value to software engineers.
Mobile application distribution platforms (the Apple store, the Google Play store, and so on) let users easily search for, purchase, and install software applications. Download volumes are very large: the Apple store alone sees about 1 billion downloads per month. These platforms let users submit feedback on downloaded applications, and both ratings and comments are publicly visible. If used well, this feedback can become a channel of communication between users and developers, helping developers understand user requirements faster and more completely and take them into account in software iterations.
Many case studies have shown that user feedback contains valuable information such as bug reports, feature requests, and user experience. For developers, user comments in application markets help them better understand user requirements and improve software quality. Other studies analyze how to automatically classify comments as useful or useless, how to extract user requirements from feedback, and so on. Research also shows that, because the number of user comments is very large and their language is very free-form, it is difficult to fully mine the useful information in comments by manual inspection; user comments need to be mined in an automated way.
Some existing studies (see references [1] to [3]) discuss how to divide the content of user comments into several different types. Classifying user comments can reveal user intent and help developers understand user requirements. In this field, some existing studies use coarse-grained classification types, some classification methods do not make full use of comment attributes, and the evaluation results still have room for improvement. The processing and classification of comments can therefore still be improved considerably.
References:
[1] Maalej W, Nabil H. Bug report, feature request, or simply praise? On automatically classifying app reviews[C]//2015 IEEE 23rd International Requirements Engineering Conference (RE). IEEE, 2015: 116-125.
[2] Panichella S, Di Sorbo A, Guzman E, et al. How can I improve my app? Classifying user reviews for software maintenance and evolution[C]//Software Maintenance and Evolution (ICSME), 2015 IEEE International Conference on. IEEE, 2015: 281-290.
[3] McIlroy S, Ali N, Khalid H, et al. Analyzing and automatically labelling the types of user issues that are raised in mobile app reviews[J]. Empirical Software Engineering, 2016, 21(3): 1067-1106.
Summary of the invention
User comments on current mobile software distribution platforms are rich, numerous, and diverse, and developers need to mine this user feedback, yet existing methods suffer from coarse classification granularity and insufficient use of comment attributes. To address this, the present invention proposes a machine-learning-based method for mining mobile internet user comments. The method better solves these problems and improves the accuracy of comment classification, so that developers can understand user requirements faster and more completely.
The machine-learning-based method of the present invention for mining mobile internet user comments comprises the following steps:
Step 1: Sample the user comments of applications in the field to be studied.
The method needs a certain amount of manually labeled data as a training set. Users can sample the user comments of applications according to the software field they are interested in.
Step 2: Determine the problem types contained in the user comments, manually label the sampled comments, and check and verify the labeling results.
Labeling the sampled comments means marking the problem type to which each sample belongs. The problem types can be obtained by examining the comments one by one, or the user can specify the problems of interest.
Step 3: Prepare the user comment data set of the applications to be analyzed; the user can select the comment data of interest.
Step 4: Preprocess the comment data set verified in step 2 and the user comment data set obtained in step 3. Preprocessing includes word segmentation and building word frequency vectors using the vector space model (VSM) and the TF-IDF algorithm.
Step 5: Set an attribute that identifies the application type. The attribute distinguishes two classes of applications: in one class, services and content are provided only by the developer; in the other, users also contact and communicate with other people or enterprises. Add this attribute to the word frequency vector of every comment in the verified labeled comment data set and in the user comment data set obtained in step 4.
Step 6: Build a binary classifier for each problem type. Using the verified labeled comment data set as the training set and the user comment data set obtained in step 3 as the prediction set, classify with the binary classifier of each problem type.
In step 6, the binary classifier uses a linear support vector machine with a cost-sensitive meta-classifier added. Classification is run with different cost matrix values set for the cost-sensitive meta-classifier, and the cost matrix that gives the best result is selected.
The advantages and positive effects of the present invention are as follows. (1) The method is intuitive, simple, and effective. In view of the characteristics of comment data, it enriches the features used by the classifier by adding a data attribute, alleviates the existing data imbalance problem to some extent through a cost-sensitive meta-classifier, and improves the classifier through a reasonable parameter configuration of the support vector machine. Combining these measures, the method outperforms the best current user comment classification method: compared with the results of the best current method (reference [3]), it improves the accuracy rate (precision) by 14% and the recall by 30%. (2) The method can flexibly meet the individual needs of users. The selection of the field of interest and the labeled data in step 1, the definition of problem types in step 2, and the analysis and comparison approach and data in step 3 can all be changed freely according to the user's needs, so the method can be generalized to diverse usage scenarios.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall flow of the machine-learning-based user comment mining method of the present invention;
Fig. 2 is a schematic diagram of the comment classification method of the present invention;
Fig. 3 shows the analysis results of applying the present invention to communication and social applications in 360 Mobile Assistant.
Detailed description
To help those of ordinary skill in the art understand and implement the present invention, specific embodiments are described below with reference to the drawings.
360 Mobile Assistant is one of the most popular app stores in China; it is an Android application distribution platform operated by Qihoo 360. As of February 2015, the store had more than 800 million users in total. Cumulative application downloads had reached 64 billion, and average daily distribution had reached 180 million downloads. 360 Mobile Assistant is widely used, and its number of users in China far exceeds that of the Google Play store. In addition, users' reviews of apps can be crawled from its website. The applications distributed by 360 Mobile Assistant are free. In this app store, a user review contains the date, a rating (positive, neutral, or negative), and the comment text. The content mined here is the neutral and negative comments, because positive comments are mostly praise for the application, whereas neutral and negative comments reflect the software problems that users want to complain about.
Taking 360 Mobile Assistant as the example platform, the specific steps P01 to P06 of the machine-learning-based mobile internet user comment mining method are described in detail below.
Step P01: Sample the user comments in the application market. In 360 Mobile Assistant, applications are divided into 13 categories: system and security, communication and social, music and video, news and reading, lifestyle, themes and wallpapers, office and business, photography and video, shopping, maps and travel, education, finance, and health and medical. Applications are selected for analysis according to the following criteria: 1. based on the overall ranking provided by the market, select the top-ranked application in each category as of July 3, 2016; 2. remove applications with fewer than 100 neutral and negative comments in total. This yields 11 applications, as shown in Table 1.
Then a crawler collects the user comments of these applications, the number of neutral and negative comments is counted, and these comments are randomly sampled at a 95% confidence level with a 5% confidence interval.
Table 1: Sampling of applications
Application | Category | Total comments | Neutral and negative comments | Sampled comments
Mobile Taobao | Shopping and deals | 100236 | 12594 | 373
Meitu Xiuxiu | Photography and video | 89561 | 4313 | 353
Toutiao | News and reading | 23522 | 1191 | 291
Alipay | Finance and money management | 20809 | 8808 | 369
WeChat | Communication and social | 111043 | 36227 | 381
Youku | Video and audio | 77830 | 17759 | 377
360 Security Guard | System and security | 81847 | 29373 | 380
360 Desktop | Themes and wallpapers | 22518 | 6881 | 364
360 Cloud Disk | Office and business | 10246 | 3950 | 351
Didi Chuxing | Maps and travel | 16620 | 1367 | 301
Zuoyebang | Education and learning | 82004 | 6234 | 362
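The sampled counts in Table 1 are consistent with a standard sample size calculation. The sketch below is an assumption, since the text does not name the formula it used: it applies Cochran's formula for a proportion with a finite population correction, taking z = 1.96 for the 95% confidence level, a 5% margin of error, and p = 0.5. The function name `sample_size` is illustrative.

```python
import math


def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    Assumed formula (not stated in the source): Cochran's formula
    followed by the finite population correction, rounded up.
    """
    n0 = z ** 2 * p * (1 - p) / margin ** 2  # size for an infinite population
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# Neutral and negative comment counts taken from Table 1
print(sample_size(12594))  # 373
print(sample_size(1191))   # 291
```

Under these assumptions the function reproduces the sampled counts listed in Table 1 for the population sizes given there.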
Step P02: Determine the problem types contained in the user comments. The goal is to find the types of problems in the comments that are meaningful to developers. First, create a problem type list that is initially empty, then manually check the content of each sampled comment. If the content matches a type in the problem list, label the sample with that type. If it matches none, add a new type to the problem list and restart labeling against the updated list. In the end, all problem types and the labeled samples are obtained, as shown in Table 2. Comments may be reviewed several times during this process, which also reduces human error.
Table 2: List of problem types
The labeling results are manually checked and verified again to reduce human error.
Step P03: Prepare the user comment data set to be analyzed. For example, for a specific type of application, make a longitudinal comparison across different applications (including comparisons between different scoring bands and within the same band) of the distribution of complaint types and the differences in the share of each complaint type, and summarize the common requirements and priorities of that type of application.
For communication and social applications, the present invention can select the applications to analyze as follows:
1. Denote the applications of a given category in 360 Mobile Assistant, in descending order of downloads, as a_1, a_2, a_3, ..., a_n, and the corresponding download counts as d_1, d_2, ..., d_n;
2. Select applications a_1, a_2, a_3, ..., a_k that satisfy the condition
3. Crawl a number of comments for each a_i, including positive, neutral, and negative comments. With a weight of 10 for each positive comment, 5 for each neutral comment, and 1 for each negative comment, sum the weights of the comments to compute the score s_i of application a_i, for i = 1, 2, ..., k;
4. Select two scoring bands, 9 points and above and 7 points and below, and in each band select the 5 applications with the most downloads as the objects of study.
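The scoring and band selection in items 3 and 4 can be sketched as follows. This is a minimal illustration under a stated assumption: because the bands are given on a 10-point scale, the summed weights are divided by the number of comments, which the text does not say explicitly. The names `WEIGHTS`, `app_score`, and `pick_bands` are hypothetical.

```python
# Weights from the text: positive 10, neutral 5, negative 1
WEIGHTS = {"positive": 10, "neutral": 5, "negative": 1}


def app_score(ratings):
    """Score of one application from the ratings of its crawled comments.

    Assumption: the weight sum is averaged over the comments so that the
    score stays on the same 1 to 10 scale as the bands in item 4.
    """
    return sum(WEIGHTS[r] for r in ratings) / len(ratings)


def pick_bands(apps, high=9.0, low=7.0, top=5):
    """apps is a list of (name, downloads, score) tuples.

    Returns the names of the `top` most downloaded applications in the
    high band (score >= high) and in the low band (score <= low).
    """
    by_downloads = sorted(apps, key=lambda a: a[1], reverse=True)
    high_band = [name for name, _, s in by_downloads if s >= high][:top]
    low_band = [name for name, _, s in by_downloads if s <= low][:top]
    return high_band, low_band
```

Applications whose score falls strictly between the two thresholds are left out of both bands, which matches the two-band wording of item 4.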
Once the applications to analyze are chosen, the first 2000 comments of each are crawled for the subsequent analysis.
In addition, different types of applications can be compared to observe how widespread it is that complaint comments reflect user requirements; user feedback on the iterations of an application can be analyzed with version updates as nodes, to evaluate past iterations and predict future ones; and so on.
Step P04: Preprocess the labeled comment data set verified in step P02 and the user comment data set from step P03. Preprocessing includes word segmentation and building word frequency vectors using the vector space model (VSM) and the TF-IDF algorithm.
Unlike English text, Chinese text is not delimited into words, so word segmentation is the basis of Chinese information processing; segmented text is easier for a computer to process and understand. Segmentation is therefore performed first. The present invention segments the comment text with Jieba, an efficient Python word segmentation component. Standalone numbers and non-Chinese characters are deleted because they lack useful information. Stop words are retained, however, because some of them are meaningful for determining the problem type, such as "should not".
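The cleaning rule described above can be sketched in a few lines. This is an illustrative sketch, not the patented implementation: `clean_tokens` and `segment` are hypothetical names, the Unicode range `\u4e00-\u9fff` is an assumed definition of "Chinese character", and `segment` requires the third-party `jieba` package, whose `jieba.lcut` function returns a list of tokens.

```python
import re

# Assumption: "Chinese character" means the CJK Unified Ideographs block
NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]")


def clean_tokens(tokens):
    """Strip digits, punctuation and other non-Chinese characters.

    Tokens left empty are dropped. Stop words are deliberately kept,
    because some of them help identify the problem type.
    """
    cleaned = (NON_CHINESE.sub("", token) for token in tokens)
    return [token for token in cleaned if token]


def segment(text):
    """Segment a comment with Jieba (third-party: pip install jieba)."""
    import jieba  # imported lazily so clean_tokens works without it
    return clean_tokens(jieba.lcut(text))
```

No stop word list is applied anywhere in the sketch, so a token such as 不要 survives the cleaning step.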
The vector space model (VSM) is a text representation model suited to large corpora. In this model, the text space is regarded as a vector space spanned by a set of orthogonal feature vectors. Each dimension of a vector corresponds to a feature in the text, and its value is the weight of that feature in the text. TF-IDF is a common method for computing the weights: TF denotes term frequency and IDF denotes inverse document frequency. The main idea of TF-IDF is that if a word or phrase appears frequently in one document or one class of documents (high TF) but rarely in other documents, it is considered to have good discriminating power for classification. The present invention builds the word frequency vectors with the StringToWordVector class provided by the data mining toolkit WEKA, which implements the TF-IDF algorithm. In addition, words that appear fewer than three times in the data set are filtered out, which eliminates rare misspellings.
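To make the weighting concrete, the sketch below builds TF-IDF vectors in plain Python and applies the "fewer than three occurrences" filter. It is a simplified illustration with the textbook weight tf × log(N / df); it does not claim to reproduce the exact transforms or options of WEKA's StringToWordVector, and the name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs, min_count=3):
    """Build TF-IDF vectors from tokenized comments.

    docs is a list of token lists. Words occurring fewer than
    `min_count` times in the whole data set are filtered out.
    Returns (vocabulary, vectors), one vector per comment.
    """
    total = Counter(tok for doc in docs for tok in doc)
    vocab = sorted(t for t, c in total.items() if c >= min_count)
    # Document frequency: number of comments containing each word
    df = Counter(tok for doc in docs for tok in set(doc))
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors
```

A word that appears in every comment receives weight 0 under this formula, which reflects the idea that such a word does not help separate one class of comments from another.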
After the preprocessing in this step, every comment is represented by a word frequency vector.
Step P05: Add an attribute to the verified labeled comment data set and to the user comment data set from step P03. The attribute distinguishes mobile applications in which services are provided only by the developer from mobile applications in which users contact and communicate with other people, enterprises, and so on. The applications selected in step P03 can be divided into two classes, as shown in Table 3.
Table 3: Application type attribute
Class 1 | Class 2
Alipay | Didi Chuxing
360 Desktop | Mobile Taobao
Meitu Xiuxiu | WeChat
Toutiao | Youku
Zuoyebang |
360 Security Guard |
360 Cloud Disk |
This attribute is added to the WEKA word frequency vector of every comment as a categorical attribute (nominal attribute).
Previous research on user comment classification considers only the comment text as the basis for classification. The present invention also considers whether the application provides services and content solely through the developer or whether users interact with other people or enterprises, since this leads to differences in comment content. For example, Didi Chuxing is an application for booking taxis and private cars online, and some of its user comments focus on the service provided by drivers rather than on software functions. For an application such as 360 Cloud Disk (which provides a cloud storage service), however, user comments contain nothing related to other people or other businesses. The present invention therefore divides applications into two classes: one in which services and content are provided only by the developer, and another in which users contact and communicate with other people or enterprises. This attribute is added to the verified comment data set and to the user comment data set from step P03; specifically, the attribute identifying the application type is added to the word frequency vector of every comment. The added attribute enriches the features used by the classifier and improves the mining results for comment data from different applications.
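Adding the application type to each comment vector amounts to appending one extra feature. The sketch below is illustrative only: the 0/1 encoding, the `APP_TYPE` mapping, and the function name `add_app_type` are assumptions, whereas the embodiment stores the value as a WEKA nominal attribute.

```python
# Hypothetical encoding of the two application classes
DEVELOPER_ONLY = 0.0     # services and content come only from the developer
USER_INTERACTION = 1.0   # users also deal with other people or enterprises

# Illustrative subset of the class assignment
APP_TYPE = {
    "Alipay": DEVELOPER_ONLY,
    "WeChat": USER_INTERACTION,
}


def add_app_type(vector, app_name):
    """Return a copy of the word frequency vector with the type appended."""
    return list(vector) + [APP_TYPE[app_name]]
```

Every comment of the same application receives the same value, so the feature lets the classifier learn application-type-specific patterns on top of the text features.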
Step P06: Using the verified labeled comment data set as the training set and the user comment data set obtained in step P03 as the prediction set, perform binary classification for each problem type.
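Classifying each problem type separately means that every type gets its own 0/1 target vector over the same comments. A minimal sketch of that conversion follows; the function name `binary_targets` and the example type names in the usage note are hypothetical.

```python
def binary_targets(labelled, problem_types):
    """Turn multi-label annotations into one binary target per type.

    labelled is a list with one set of problem types per comment.
    Returns a dict mapping each problem type to a list of 0/1 labels
    aligned with the comments.
    """
    return {
        ptype: [1 if ptype in labels else 0 for labels in labelled]
        for ptype in problem_types
    }
```

For example, with the annotations `[{"crash"}, {"crash", "update"}, set()]` the target for a hypothetical type `"crash"` is `[1, 1, 0]`, and a comment with no matching type is a negative sample for every classifier.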
A binary classifier is built for each problem type. Because one user comment may involve several problem types, multiple binary classifiers are needed. The binary classifiers use a support vector machine (SVM). When the sample size is small and the number of features is very large, a nonlinear kernel is usually inaccurate and may divide the feature space incorrectly, so a linear SVM is used as the classifier to optimize the classification results. In addition, for some problem types the negative samples far outnumber the positive samples, and this data imbalance may bias the classifier toward predicting new samples as negative. The present invention therefore handles the problem with cost-sensitive learning: a cost-sensitive meta-classifier is added to the binary classifier and reasonable cost matrix parameters are set. The data are reweighted according to the different misclassification costs; when the cost of predicting a positive sample as negative is higher, the weight of positive samples is increased.
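The reweighting idea can be illustrated with a small sketch. It assumes a 2×2 cost matrix indexed as `cost_matrix[actual][predicted]` and weights each training sample by the total misclassification cost of its true class, rescaled so that the weights sum to the number of samples. This is a simplified illustration of cost-based reweighting, not a reproduction of WEKA's CostSensitiveClassifier internals; `instance_weights` is a hypothetical name.

```python
def instance_weights(labels, cost_matrix):
    """Weight training samples by the misclassification cost of their class.

    labels contains 0 (negative) or 1 (positive).
    cost_matrix[actual][predicted] is the cost of that prediction.
    The weights are rescaled so that they sum to len(labels).
    """
    class_cost = [sum(row) for row in cost_matrix]
    raw = [class_cost[y] for y in labels]
    scale = len(labels) / sum(raw)
    return [w * scale for w in raw]


# One positive among three negatives; missing a positive costs 3, a false alarm 1
weights = instance_weights([1, 0, 0, 0], [[0, 1], [3, 0]])
```

With these costs the single positive sample carries three times the weight of each negative sample, which counteracts the pull toward the majority class.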
In the implementation, for each problem type, the embodiment of the present invention builds the binary classifier with the support vector machine and the CostSensitiveClassifier class provided by the data mining toolkit WEKA. The parameters of the SMO class are set so that it acts as a linear support vector machine, and different cost matrices are set for WEKA's CostSensitiveClassifier class to find the best classification result. The present invention thus improves the classifier through a reasonable parameter configuration of the support vector machine.
As shown in Fig. 2, after the attribute is added in step P05: 1. The verified user comments are split into training data and test data. The CostSensitiveClassifier class provided by WEKA is used as the meta-classifier, with WEKA's support vector machine implementation, the SMO class, at its default parameters, to obtain a classification model and its evaluation results. Setting different values for the cost matrix required by CostSensitiveClassifier gives different evaluation results. After several rounds of learning, the cost matrix values that give the best result are selected as the parameters for the subsequent prediction. 2. All of the verified user comments are then used as the training set and the user comment data set from step P03 as the prediction set. With the best cost matrix values found above as parameters, the CostSensitiveClassifier class provided by WEKA is used as the meta-classifier, with the SMO class at its default parameters, to build the classification model and obtain the results. These operations are performed for every problem type.
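The search over cost matrix values in operation 1 can be sketched as a simple loop. Two things here are assumptions rather than statements from the text: the F1 score is used as the measure of the "best" result, and `train_and_predict` is a hypothetical callback standing in for training the cost-sensitive classifier with a given cost value and predicting the held-out test data.

```python
def f1_score(y_true, y_pred):
    """F1 of the positive class (label 1); 0.0 if there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_cost(candidates, train_and_predict, y_test):
    """Return the candidate cost value with the highest F1 on the test data.

    train_and_predict(cost) is a hypothetical callback that trains with
    the given false-negative cost and returns predictions for the test
    split. On ties the earliest candidate is kept.
    """
    return max(candidates, key=lambda c: f1_score(y_test, train_and_predict(c)))
```

The selected value would then be reused in operation 2, where the model is retrained on all verified comments before predicting the unlabeled data set.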
Taking the analysis target described in step P03 as an example, the results obtained by this method are shown in Fig. 3. It can be seen that: 1. for communication and social applications, the problem type with the highest share is update problems, that is, user demands on this type of application concentrate on poor experience after updates; crashes and functional complaints are also problems that need attention; 2. comparing the average share of each requirement across scoring bands shows that, apart from the common problems above, the lower-scoring group of applications performs worse than the high-scoring applications in functional complaints, feature requests, network problems, response time, and so on, and the developers of these applications should improve in these respects. Compared with the results of the best current method (reference [3]), the method of the present invention improves the accuracy rate (precision) by 14% and the recall by 30%.
The method of the present invention provides, in an intuitive, simple, and effective way, a machine-learning-based user comment mining method. By adding a data attribute, using a cost-sensitive meta-classifier, and configuring the support vector machine reasonably, it outperforms existing user comment classification research. It can also flexibly meet the individual needs of users: the selection of the field of interest and the labeled data in step 1, the definition of problem types in step 2, and the analysis and comparison approach and data for the applications in step 3 can all be changed freely according to the user's needs, so the method can be generalized to diverse usage scenarios.

Claims (5)

1. A machine-learning-based method for mining mobile internet user comments, comprising the following steps:
Step 1: sampling the user comments of applications in the field to be studied;
Step 2: determining the problem types contained in the user comments, manually labeling the sampled comments, and checking and verifying the labeling results;
Step 3: obtaining the comment data set of the applications to be analyzed;
Step 4: preprocessing the labeled comment data set verified in step 2 and the comment data set obtained in step 3, the preprocessing comprising word segmentation and building word frequency vectors using the vector space model and the TF-IDF algorithm, where TF denotes term frequency and IDF denotes inverse document frequency;
characterized in that:
Step 5: setting an attribute that identifies the application type, the attribute distinguishing two classes of applications, one in which services and content are provided only by the developer and another in which users contact and communicate with other people or enterprises; and adding this attribute to the word frequency vector of every comment in the verified labeled comment data set and in the user comment data set obtained in step 4;
Step 6: building a binary classifier for each problem type, and, using the comment data set verified in step 2 as the training set and the user comment data set obtained in step 3 as the prediction set, classifying with the binary classifier of each problem type;
wherein, in step 6, the binary classifier uses a linear support vector machine with a cost-sensitive meta-classifier added, classification is run with different cost matrix values set for the cost-sensitive meta-classifier, and the cost matrix that gives the best result is selected.
2. The method according to claim 1, characterized in that, in step 4, Jieba is used for word segmentation, standalone numbers and non-Chinese characters are deleted, and stop words are retained.
3. The method according to claim 1, characterized in that, in step 4, when the word frequency vectors are built, words that appear fewer than three times in the comment data set are filtered out.
4. The method according to claim 1, characterized in that, in step 6, for each problem type, the binary classifier is built with the support vector machine and the CostSensitiveClassifier class provided by the data mining toolkit WEKA.
5. The method according to claim 1 or 4, characterized in that, in step 6, the following operations are performed for each problem type: (1) the verified comment data set is split into training data and test data, the CostSensitiveClassifier class provided by WEKA is used as the meta-classifier, with the SMO class, WEKA's support vector machine implementation, at its default parameters, and a classification model and its evaluation results are obtained; setting different values for the cost matrix required by the CostSensitiveClassifier class gives different evaluation results, and after several rounds of learning the cost matrix values that give the best result are selected as the parameters for the subsequent prediction; (2) the whole verified comment data set is used as the training set and the comment data set of step 4 as the prediction set, the cost matrix values that gave the best result in (1) are used as parameters, the CostSensitiveClassifier class provided by WEKA is used as the meta-classifier, with the SMO class at its default parameters, a classification model is built, and the classification results of the prediction set are obtained.
CN201810233877.3A 2018-03-21 2018-03-21 Mobile internet user comment mining method based on machine learning Active CN108520012B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810233877.3A CN108520012B (en) 2018-03-21 2018-03-21 Mobile internet user comment mining method based on machine learning

Publications (2)

Publication Number Publication Date
CN108520012A (en) 2018-09-11
CN108520012B (en) 2022-02-18

Family

ID=63432916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810233877.3A Active CN108520012B (en) 2018-03-21 2018-03-21 Mobile internet user comment mining method based on machine learning

Country Status (1)

Country Link
CN (1) CN108520012B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049759A (en) * 2012-12-14 2013-04-17 上海邮政科学研究院 Postal code recognition method for postal sorting system
US20150378986A1 (en) * 2014-06-30 2015-12-31 LinkedIn Corporation Context-aware approach to detection of short irrelevant texts
CN105550269A (en) * 2015-12-10 2016-05-04 复旦大学 Product comment analyzing method and system with learning supervising function
CN105574003A (en) * 2014-10-10 2016-05-11 华东师范大学 Comment text and score analysis-based information recommendation method
CN105740366A (en) * 2016-01-26 2016-07-06 哈尔滨工业大学深圳研究生院 Inference method and device of MicroBlog user interests
CN105787662A (en) * 2016-02-25 2016-07-20 西北工业大学 Mobile application software performance prediction method based on attributes
CN106202481A (en) * 2016-07-18 2016-12-07 量子云未来(北京)信息科技有限公司 The evaluation methodology of a kind of perception data and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIAWEI HE 等: "Recommendations based on LDA Topic Model in Android Applications", 《2016 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111736804A (en) * 2020-08-25 2020-10-02 南京大学 Method and device for identifying App key function based on user comment
CN111736804B (en) * 2020-08-25 2020-12-22 南京大学 Method and device for identifying App key function based on user comment
CN114399709A (en) * 2021-12-30 2022-04-26 北京北大医疗脑健康科技有限公司 Child emotion recognition model training method and child emotion recognition method

Also Published As

Publication number Publication date
CN108520012B (en) 2022-02-18

Similar Documents

Publication Publication Date Title
CN108021929B (en) Big data-based mobile terminal e-commerce user portrait establishing and analyzing method and system
Kühl et al. Supporting customer-oriented marketing with artificial intelligence: automatically quantifying customer needs from social media
WO2019214245A1 (en) Information pushing method and apparatus, and terminal device and storage medium
Gu et al. "What parts of your apps are loved by users?" (T)
CN104573054B (en) A kind of information-pushing method and equipment
WO2021098648A1 (en) Text recommendation method, apparatus and device, and medium
CN107862022B (en) Culture resource recommendation system
Gozhyj et al. Uniform Method of Operative Content Management in Web Systems.
CN110019616B (en) POI (Point of interest) situation acquisition method and equipment, storage medium and server thereof
CN108776671A (en) A kind of network public sentiment monitoring system and method
CN111831802B (en) Urban domain knowledge detection system and method based on LDA topic model
US20160048587A1 (en) System and method for real-time dynamic measurement of best-estimate quality levels while reviewing classified or enriched data
US10002187B2 (en) Method and system for performing topic creation for social data
WO2013170344A1 (en) Method and system relating to sentiment analysis of electronic content
CN103544188A (en) Method and device for pushing mobile internet content based on user preference
CN105787025A (en) Network platform public account classifying method and device
Mousavi Nejad et al. Establishing a strong baseline for privacy policy classification
CN115098650B (en) Comment information analysis method based on historical data model and related device
CN111444304A (en) Search ranking method and device
US9996529B2 (en) Method and system for generating dynamic themes for social data
US20220121668A1 (en) Method for recommending document, electronic device and storage medium
CN110941702A (en) Retrieval method and device for laws and regulations and laws and readable storage medium
US20130346385A1 (en) System and method for a purposeful sharing environment
CN113742496B (en) Electric power knowledge learning system and method based on heterogeneous resource fusion
Kochuieva et al. Usage of Sentiment Analysis to Tracking Public Opinion.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant