CN109146625B

CN109146625B - Content-based multi-version App update evaluation method and system

Info

Publication number: CN109146625B
Application number: CN201810921293.5A
Authority: CN
Inventors: 陶良乐; 陈湘萍; 周凡
Original assignee: Sun Yat Sen University
Current assignee: Sun Yat Sen University
Priority date: 2018-08-14
Filing date: 2018-08-14
Publication date: 2022-04-26
Anticipated expiration: 2038-08-14
Also published as: CN109146625A

Abstract

The embodiment of the invention discloses a content-based multi-version App update evaluation method and system. The method comprises: automatically traversing the App to acquire information and storing it in a database; analyzing and identifying the information in the database to obtain the differences between different versions of the App; preprocessing related information, such as the comment text and comment time of the App, to obtain the preprocessed user comments corresponding to each version of the App; and scoring the preprocessed user comments against the differences between versions to obtain a comprehensive sentiment analysis value for each App update. By implementing the embodiment of the invention, more comprehensive feedback can be provided to developers and working efficiency is improved; developers are also given information about the lifecycle of a particular function.

Description

Content-based multi-version App update evaluation method and system

Technical Field

The invention relates to the technical field of content identification and content comparison, in particular to a content-based multi-version App updating evaluation method and system.

Background

In mobile application development, App version updates are common. With each update, the publisher releases an update log describing the major changes in the new version. However, the log covers only the main modifications or highlighted functions of the App; it is incomplete, and many changes never appear in it. For example, the update log of the Taobao App version 7.11 reads: "the message home page is upgraded, and some bugs are fixed." Such descriptions are broad and unspecific, yet conventional software analysis can only use the update log as the basis for assessing a version update. Because the log is short and its information unspecific, filtering useful information out of update logs is time-consuming and labor-intensive. It would be of great value if all differences between two versions, including added and deleted content, could be identified.

However, most research on version updates focuses on mining user comments. One example is tracking user comments online and, at the time of each version update, identifying the problems mentioned repeatedly in the comments so as to find new problems in the software. Another is analyzing App user comments to obtain their sentiment polarity. Such studies analyze only the sentiment of user comments and do not evaluate the updated content of the App. A review of this research shows that existing methods for evaluating multi-version updated content rely mostly on comments: when an App update is evaluated, only the content of the update log is taken, and the updated content is then evaluated or recommended on the basis of user comments. It would be more valuable to developers if all differences between two versions, including added and deleted content, could be identified and combined with user comments to rate the updated content of the App.

The prior art relates to a method for computing mobile application similarity based on content. The method comprises the following steps: after acquiring a large amount of mobile application information, extracting the mobile application information, wherein the mobile application information comprises an application name, an application type, an application description, an application size and the like; performing word segmentation on the application description information; dividing the content after word segmentation into two parts, integrating one part of the content to be used as a training corpus of a word2vec model, storing the other part of the content to be in a document set form, calculating TF-IDF, and storing the result in an HBase data warehouse; and carrying out App similarity query and calculation. The method for calculating the similarity of the mobile application based on the content has the following beneficial effects: the similarity query of the App can be quickly responded, the App can be well represented based on the App characteristics and the description information of the content, the accuracy is high, and the searching and recommending accuracy of the App can be improved.

The application scenario of this method is searching for similar Apps. It does not analyze the content of an App; it represents the App only through its features and description information, and it lacks dynamic information about the App at run time. The approach is efficient and inexpensive for finding Apps with similar functions or names, but it cannot identify the differences between versions of the same App.

Another prior-art technique is an Android application permission inference method and device based on user comments, which belongs to the technical field of information security. The method and device mine the functional features of an application from user comments in the application market, establish the relation between those functional features and the application's permissions, and infer the application's permission requests from the user's understanding of the application's functions and the user's view of its security and privacy.

This technique targets application permission inference, and the information it obtains from comments is much less than the information about App functions and updates.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a content-based multi-version App update evaluation method and system. Existing approaches analyze user comments to find the problems that many users report and then evaluate and analyze those problems; however, the problems users report are mostly defects of the version rather than evaluations of the updated content, so no evaluation of the App's updated content can be obtained. The invention extracts all differences between App versions, screens the user comments, and performs sentiment analysis on the screened comments to obtain the users' score for the updated content, so that the App can be evaluated more comprehensively.

In order to solve the above problems, the present invention provides a content-based multi-version App update evaluation method, including:

acquiring App information and storing the App information in a database by automatically traversing App;

acquiring information in a database, analyzing and identifying the information, and acquiring differences of App of different versions;

obtaining related information, such as the comment text and comment time, from the App application store and preprocessing it to obtain the preprocessed user comments corresponding to each version of the App;

and acquiring the preprocessed user comments corresponding to each version of the App, and scoring them against the differences between versions of the App to obtain a comprehensive sentiment analysis value for each App update.

Preferably, the specific steps of acquiring and storing the information in the database include:

obtaining App package names and version numbers through a static analysis technology, grouping the App package names and version numbers, and storing obtained App data into a database;

and traversing the App, dynamically acquiring App information, and storing the App information in an unstructured database.

Preferably, the specific steps of acquiring the information in the database for analysis and identification processing include:

acquiring the App data, selecting the different versions of the same App, and making a list of those versions, by means of a program written in the Java language that selects one group, namely the list of all versions of the same App;

acquiring a list of all different versions of the same App, and selecting Apps of two adjacent versions to extract interface information of the two Apps;

acquiring interface information of the two Apps, and comparing the contents of the interface information of the two Apps to obtain two interfaces with similarity;

and acquiring the two interfaces with similarity, and identifying all visible characters in the two interfaces with similarity to obtain the content with difference in the interfaces.

Preferably, the specific steps of acquiring the interface information of the two apps and comparing the contents of the two apps include:

positioning interfaces with the same PageId in the old version, extracting interfaces of the new version and the old version with the same PageId, comparing the interface contents, and obtaining the unmatched interface attributes in the updated new version;

defining a comp value to measure the content similarity of two interfaces, locating the interfaces with the same activity through App-related attributes and comparing them; for English text, first segmenting the content into words and then calculating the similarity of English words, with the following formulas:

IC(w) = -log(p(w)),

sim(w_1, w_2) = 2 * IC(LCS(w_1, w_2)) / (IC(w_1) + IC(w_2)),

where p(w) is the frequency of occurrence of the word w in the WordNet senses, IC(w) is the information content of w, and LCS(w_1, w_2) is the nearest common ancestor of the words w_1 and w_2. The similarity of two words, sim(w_1, w_2), is thus twice the information content of their nearest common ancestor divided by the sum of the information content of the two words, which keeps it within 0.0 to 1.0.

For Chinese text, word segmentation is performed first, and then text similarity comparison is performed.

When comp is 1.0, judging that the two interfaces are completely the same interface, and recording interface information;

when comp is 0.0, judging that the two interfaces are completely different interfaces, and recording interface information;

when 0.0 < comp < 1.0, two interfaces having a certain similarity are determined.
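The word-similarity measure used in the comparison step above can be illustrated with a small sketch. It assumes the measure is the usual ratio of twice the information content of the nearest common ancestor to the sum of the information content of the two words. The toy taxonomy, the counts, and all function names are hypothetical and merely stand in for WordNet.

```python
import math

# Hypothetical toy taxonomy (child -> parent); it stands in for WordNet.
PARENT = {"dog": "mammal", "cat": "mammal", "mammal": "animal",
          "bird": "animal", "animal": None}
# Hypothetical occurrence counts of each concept itself.
COUNT = {"dog": 10, "cat": 10, "mammal": 5, "bird": 15, "animal": 10}

def ancestors(w):
    """Return w followed by all of its ancestors, nearest first."""
    chain = []
    while w is not None:
        chain.append(w)
        w = PARENT[w]
    return chain

def p(w):
    """Probability of concept w: its own count plus that of all descendants."""
    total = sum(COUNT.values())
    subtree = sum(c for x, c in COUNT.items() if w in ancestors(x))
    return subtree / total

def ic(w):
    """Information content IC(w) = -log(p(w))."""
    return -math.log(p(w))

def lcs(w1, w2):
    """Nearest common ancestor of w1 and w2 in the taxonomy."""
    up2 = set(ancestors(w2))
    return next(a for a in ancestors(w1) if a in up2)

def similarity(w1, w2):
    """2 * IC(LCS(w1, w2)) / (IC(w1) + IC(w2)); identical words score 1.0."""
    if w1 == w2:
        return 1.0
    denominator = ic(w1) + ic(w2)
    if denominator == 0:
        return 0.0
    return 2 * ic(lcs(w1, w2)) / denominator
```

In this toy data, "dog" and "cat" share the fairly specific ancestor "mammal" and so receive a score strictly between 0 and 1, whereas "dog" and "bird" only share the root, whose information content is zero.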

Preferably, the specific step of performing recognition processing on all visible characters in the two interfaces with similarity includes:

acquiring the two interfaces with similarity, extracting and processing to obtain all control attributes of the interfaces;

defining the set of all visible text contents in the attributes of the controls as T.

Because there are two interfaces, two multidimensional vectors are defined:

T_i = {text_1, text_2, ..., text_n}

T_j = {text_1, text_2, ..., text_m}

T_i represents the set of all visible text contents in the new version, and T_j represents the set of all visible text contents in the old version.

The sets T_i and T_j of the two interfaces are compared. If text_k in T_i can be found in T_j, that is, the new-version content also appears in the old version, text_k is judged to be repeated content; if text_k in T_i cannot be found in T_j, that is, the new-version content does not appear in the old version, text_k is judged to be added content; if text_l in T_j cannot be found in T_i, that is, the old-version content does not appear in the new version, text_l is judged to be deleted content;

the identified difference content is retrieved and stored in a database.

Preferably, the obtaining of the comment information and comment time of the App store and other related information further includes: the update log of the software, the current version number of the software and the release time.

Preferably, the specific step of acquiring relevant information such as comment information and comment time of the App application store for preprocessing includes:

correspondingly retrieving an App version number and release time according to the time of the user comment;

obtaining all user comment data of a certain App by using a crawler tool;

reducing the different inflected forms of words in the comments to their base form by lemmatization and stemming;

and obtaining the comments in base form and filtering out stop words using NLTK to obtain the preprocessed user comments corresponding to each version of the App.
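The normalization and stop-word filtering steps above can be sketched as follows. To keep the example self-contained, a tiny hard-coded stop-word list and a naive suffix stripper are used as stand-ins for the NLTK stop-word list, lemmatizer, and stemmer; both stand-ins and all names are hypothetical.

```python
import re

# Hypothetical, tiny stop-word list; a real pipeline would use a full list.
STOP_WORDS = {"the", "a", "an", "is", "are", "it", "this", "and", "to", "of"}
# Naive suffix rules standing in for a real stemmer or lemmatizer.
SUFFIXES = ("ing", "ed", "es", "s")

def to_base_form(word):
    """Strip one common English suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(comment):
    """Lower-case, tokenize, drop stop words, and reduce words to a base form."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    return [to_base_form(t) for t in tokens if t not in STOP_WORDS]
```

For instance, `preprocess("The app crashes and keeps crashing")` maps both "crashes" and "crashing" to the same token, so comments about the same issue become comparable.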

Preferably, the specific steps of performing comparative scoring processing in combination with differences of apps of different versions include:

acquiring all the added and deleted contents of the Apps, recording, and acquiring all user comments corresponding to each App update;

acquiring all user comments corresponding to each App update, and performing text sentiment analysis on the Chinese comments and the English comments separately to obtain the users' sentiment analysis values;

and normalizing all obtained values to the range -1.0 to 1.0, presetting the score of each App update to 60, and adding the normalized sentiment analysis value of each comment to the preset score to obtain the comprehensive sentiment analysis value of each App update.

Correspondingly, the embodiment of the invention also provides a multi-version App updating evaluation system based on content, which comprises:

the collection storage module is used for acquiring information and storing the information into a database;

the analysis module is used for acquiring information in the database to perform analysis and identification processing to obtain differences of App of different versions;

the preprocessing module, which is used for preprocessing related information, such as the comment text and comment time from the App application store, to obtain the preprocessed user comments corresponding to each version of the App;

and the scoring module, which is used for scoring the preprocessed user comments corresponding to each version of the App against the differences between versions of the App to obtain the comprehensive sentiment analysis value of each App update.

Further, the collection storage module includes:

the analysis grouping unit is used for acquiring the name and the version number of the App package and grouping the App package;

the dynamic recording unit is used for traversing all Apps and recording the screen snapshot and UI hierarchy tree of each UI (user interaction interface), for storing system-level and application-level process information and the method path information triggered by each test input, and for modifying the code.

Further, the analysis module includes:

the list making unit is used for acquiring the App data, selecting the different versions of the same App, and making a list of those versions, by means of a program written in the Java language in Eclipse that selects one group, namely the list of all versions of the same App;

the extracting unit is used for acquiring lists of all different versions of the same App and selecting the App with two adjacent versions to extract interface information of the two Apps;

the comparison unit is used for acquiring interface information of the two apps, comparing the contents of the two apps and acquiring two interfaces with similarity;

and the recognition unit is used for acquiring the two interfaces with similarity, and recognizing all visible characters in the two interfaces by using a natural language processing method to obtain the contents with the difference in the interfaces.

Further, the comparing unit further includes a discriminating unit:

when comp, which measures the content similarity of the two interfaces, is 1.0, the two interfaces are judged to be exactly the same interface;

when comp is 0.0, the two interfaces are judged to be completely different interfaces;

when 0.0 < comp < 1.0, the two interfaces are judged to have a certain similarity.

Further, the preprocessing module further comprises:

the retrieval unit is used for correspondingly retrieving the App version number and the release time according to the time of the user comment;

the comment obtaining unit is used for obtaining all user comment data of a given App, including Chinese comments and English comments;

the merging unit is used for reducing the different inflected forms of words in the comments to their base form;

and the filtering unit is used for acquiring the comments in base form and filtering out stop words to obtain the preprocessed user comments corresponding to each version of the App.

Further, the scoring module further comprises:

the recording unit is used for acquiring all the added and deleted contents of the Apps, recording and acquiring all user comments corresponding to each App update;

the sentiment analysis processing unit is used for acquiring all user comments corresponding to each App update, and performing text sentiment analysis on the English comments and the Chinese comments separately to obtain the users' sentiment analysis values;

and the sentiment score calculation unit is used for normalizing all obtained values to the range -1.0 to 1.0, presetting the score of each App update to 60, and adding the normalized sentiment analysis value of each comment to the preset score to obtain the comprehensive sentiment analysis value of each App update.

By implementing the embodiment of the invention, more comprehensive feedback can be provided for developers, and the working efficiency is improved; it also provides information to the developer about the lifecycle of a particular function.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a schematic flow chart of a content-based multi-version App update evaluation method according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a content-based multi-version App update evaluation system according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Fig. 1 is a schematic flowchart of a content-based multi-version App update evaluation method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:

S1, traversing the App with an automatic traversal tool, acquiring information and storing it in a database;

S2, acquiring the information in the database, analyzing and identifying it, and obtaining the differences between different versions of the App;

S3, obtaining related information, such as the comment text and comment time from the App application store, with a crawler tool and preprocessing it to obtain the preprocessed user comments corresponding to each version of the App;

S4, acquiring the preprocessed user comments corresponding to each version of the App, and scoring them against the differences between versions of the App to obtain the comprehensive sentiment analysis value of each App update.

Specifically, the specific steps of acquiring and storing information in the database include:

S11, obtaining App package names and version numbers through a static analysis technique, grouping them, and storing the resulting App data in a database;

S12, traversing the App, dynamically acquiring App information, and storing the App information in the unstructured database.

Further explanation of S11 is:

In Eclipse, a Java toolkit is used to perform static analysis on the App installation package (the apk file) to obtain AppName, the name of the App, and AppVersion, the App version number. Different versions of the same App share the same AppName, and AppVersion values are ordered, with a newer version generally having a larger AppVersion. Apps with the same AppName are recorded as one group, marked with a Group attribute, and sorted within the group in descending order of the AppVersion attribute. A list of all Apps is thus obtained, and all information is stored in a MongoDB database.
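The grouping and ordering described above can be sketched as follows. The sketch assumes that each record is a dictionary with `AppName` and `AppVersion` keys and that version numbers are dotted integers such as `7.11.0`; the function names are hypothetical.

```python
from collections import defaultdict

def version_key(version):
    """Turn '7.11.0' into (7, 11, 0) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def group_by_app(records):
    """Group records by AppName and sort each group newest version first."""
    groups = defaultdict(list)
    for record in records:
        groups[record["AppName"]].append(record)
    for versions in groups.values():
        versions.sort(key=lambda r: version_key(r["AppVersion"]), reverse=True)
    return dict(groups)
```

Comparing versions as integer tuples rather than strings matters here: as plain strings, "7.9.0" would sort after "7.11.0".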

Further explanation of S12 is:

All Apps are traversed using a black-box technique: an intelligent input generation tool simulates user input, and the screen snapshot and UI hierarchy tree of each UI (user interaction interface) are recorded. System-level and application-level process information and the method path information triggered by each test input are stored. The code is modified, and all information is stored in the unstructured database.

Specifically, the specific steps of acquiring information in the database for analysis and identification processing include:

S21, acquiring the App data, selecting the different versions of the same App, and making a list of those versions, by means of a program written in the Java language in Eclipse that selects one group, namely the list of all versions of the same App;

S22, acquiring the list of all versions of the same App, and selecting two adjacent versions to extract the interface information of the two Apps;

S23, acquiring the interface information of the two Apps and comparing their contents to obtain two interfaces with similarity;

S24, acquiring the two interfaces with similarity, and identifying all visible text in the two interfaces with a natural language processing method to obtain the content that differs between the interfaces.

Further explanation of S22 is:

The list of all versions of the same App is sorted in descending order of the AppVersion attribute, and, starting from the latest version, two adjacent versions are selected for comparison each time. If the list contains n versions of the App, n-1 pairs are generated.
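A minimal sketch of this pairing, assuming the version list is already sorted newest first; the function name is hypothetical.

```python
def adjacent_pairs(versions):
    """Pair each version with the next older one; n versions give n-1 pairs."""
    return list(zip(versions, versions[1:]))
```

Each pair has the form (new version, old version), which matches the order in which the comparison in S23 treats the two Apps.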

The information of the two selected adjacent versions is located in the database through the AppName and AppVersion attributes. In the new version of the App, an interface is selected and its PackageName and ActivityName values are retrieved to form the PageId attribute. PageId is the value used to locate an App interface and is defined as:

PageId = PackageName + ActivityName

where PackageName is the package name of the App and ActivityName is the name of the activity. Both attributes were already obtained in S12. For each interface of the new version, the list of old-version interfaces corresponding to it can thus be found.
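The lookup by PageId can be sketched as follows, assuming each recorded interface is a dictionary with `PackageName` and `ActivityName` keys; the helper names and the dictionary layout are hypothetical.

```python
from collections import defaultdict

def page_id(package_name, activity_name):
    """PageId = PackageName + ActivityName."""
    return package_name + activity_name

def index_by_page_id(interfaces):
    """Map each PageId to the list of recorded interfaces that share it."""
    index = defaultdict(list)
    for ui in interfaces:
        index[page_id(ui["PackageName"], ui["ActivityName"])].append(ui)
    return dict(index)

def candidates(new_ui, old_index):
    """Old-version interfaces with the same PageId as a new-version interface."""
    key = page_id(new_ui["PackageName"], new_ui["ActivityName"])
    return old_index.get(key, [])
```

Because one activity can render several screens, a PageId maps to a list of interfaces rather than to a single one.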

Further explanation of S23 is:

The interfaces with the same PageId are located in the old version, the new-version and old-version interfaces with that PageId are extracted, and their contents are compared. After all interface comparisons for one PageId are completed, the PageId is updated to that of an interface in the new version that has not yet been compared.

Here, a comp value is defined to measure the similarity of the contents of the two interfaces. Interfaces with the same activity are positioned through related attributes of App, and are compared to screen out the interfaces with certain similarity.

Through the PageId and the interface number attribute thereof, the detailed information of each interface can be located and found in the database, and comp is defined as:

comp = <content_i, content_j>, comp ∈ [0.0, 1.0]

content is the set of all words in an interface. When content_i and content_j are compared, English and Chinese text are processed separately. For English text, the content is first segmented into words, and the similarity of English words is then calculated with the open-source JWS (Java WordNet Similarity) tool, using the following formulas:

IC(w) = -log(p(w)),

sim(w_1, w_2) = 2 * IC(LCS(w_1, w_2)) / (IC(w_1) + IC(w_2)),

where p(w) is the frequency of occurrence of the word w in the WordNet senses and LCS(w_1, w_2) is the nearest common ancestor of the words w_1 and w_2.

For Chinese text, segmentation is carried out by using an Ansj tool, and then comparison of text similarity is carried out.

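When comp is 1.0: the two interfaces are judged to be exactly the same interface, and the interface information is recorded;

when comp is 0.0: the two interfaces are judged to be completely different interfaces, and the interface information is recorded;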

when 0.0 < comp < 1.0: two interfaces having a certain similarity are judged.
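The three cases above amount to a simple classification of interface pairs by their comp value. The sketch below assumes comp has already been computed; the labels, the floating-point tolerance, and the function names are hypothetical.

```python
import math

def classify(comp):
    """Label an interface pair by its comp value in [0.0, 1.0]."""
    if not 0.0 <= comp <= 1.0:
        raise ValueError("comp must lie between 0.0 and 1.0")
    if math.isclose(comp, 1.0):
        return "identical"
    if comp == 0.0:
        return "different"
    return "similar"

def pairs_to_diff(scored_pairs):
    """Keep only the pairs with 0.0 < comp < 1.0 for the text comparison step."""
    return [pair for pair, comp in scored_pairs if classify(comp) == "similar"]
```

Only the pairs labeled "similar" need the text-level comparison of S24, since identical pairs contain no changes and completely different pairs share no content to align.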

Further explanation of S24 is:

The MongoDB database stores more detailed information about the two interfaces, including the attributes of each control. The set of all visible text contents in the attributes of the controls is defined as T.

Because there are two interfaces, two multidimensional vectors are defined:

T_i = {text_1, text_2, ..., text_n}

T_j = {text_1, text_2, ..., text_m}

T_i represents the set of all visible text contents in the new version, and T_j represents the set of all visible text contents in the old version. If text_k in T_i can be found in T_j, that is, the new-version content also appears in the old version, text_k is judged to be repeated content; if text_k in T_i cannot be found in T_j, that is, the new-version content does not appear in the old version, text_k is judged to be added content; if text_l in T_j cannot be found in T_i, that is, the old-version content does not appear in the new version, text_l is judged to be deleted content. The identified difference content is then stored in a database.
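The comparison of T_i and T_j described above is essentially a set difference in both directions. A minimal sketch, assuming exact string matching between text items; the function name and the result layout are hypothetical.

```python
def diff_texts(new_texts, old_texts):
    """Split visible texts into repeated, added, and deleted content.

    new_texts plays the role of T_i (new version) and old_texts of T_j
    (old version); matching is by exact string equality.
    """
    old_set, new_set = set(old_texts), set(new_texts)
    return {
        "repeated": [t for t in new_texts if t in old_set],
        "added": [t for t in new_texts if t not in old_set],
        "deleted": [t for t in old_texts if t not in new_set],
    }
```

For example, comparing `["Home", "Cart", "Live"]` against `["Home", "Cart", "Coupons"]` reports "Live" as added and "Coupons" as deleted.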

Specifically, the step of obtaining relevant information such as comment information and comment time of the App store through the crawler tool in S3 further includes: the update log of the software, the current version number of the software and the release time.

Specifically, the specific step of acquiring relevant information such as comment information and comment time of the App application store through the crawler tool to perform preprocessing in S3 includes:

S31, retrieving the App version number and release time that correspond to the time of each user comment;

S32, acquiring all user comment data of a given App with a crawler tool;

S33, reducing the different inflected forms of words in the comments to their base form by lemmatization and stemming;

S34, obtaining the comments in base form and filtering out stop words using NLTK to obtain the preprocessed user comments corresponding to each version of the App.

In S32, all user comment data of a given App is acquired as follows: English comments can be obtained from the Google application store; Chinese comments can be obtained from the Wandoujia and Xiaomi application stores.
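The matching of comments to versions in S31 can be sketched as follows. The sketch assumes that a comment belongs to the latest version released on or before the comment date, which the text does not state explicitly; the function name and data layout are hypothetical.

```python
from bisect import bisect_right
from datetime import date

def version_for_comment(comment_date, releases):
    """Return the version that was current when the comment was written.

    releases is a list of (release_date, version) tuples sorted by
    release_date in ascending order. None is returned for comments
    that predate the first release.
    """
    release_dates = [released for released, _ in releases]
    position = bisect_right(release_dates, comment_date)
    return releases[position - 1][1] if position else None
```

With releases on 2018-01-01 (version 1.0) and 2018-03-01 (version 1.1), a comment dated 2018-02-15 is assigned to version 1.0.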

Specifically, the step of performing comparative scoring processing in combination with differences of apps of different versions in S4 includes:

S41, acquiring and recording all added and deleted content of the App, and acquiring all user comments corresponding to each App update;

S42, acquiring all user comments corresponding to each App update and performing text sentiment analysis to obtain the users' sentiment analysis values; English comments are analyzed with a sentiment dictionary and a supervised classification algorithm to obtain their sentiment analysis values; Chinese comments are analyzed with an NLP API to obtain their sentiment analysis values.

S43, normalizing all obtained values to the range -1.0 to 1.0, presetting the score of each App update to 60, and adding the normalized sentiment analysis value of each comment to the preset score to obtain the comprehensive sentiment analysis value of each App update.
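The scoring in S43 can be sketched as follows. The preset value of 60 comes from the text; the linear rescaling to the range -1.0 to 1.0 is an assumption, since the text does not say how the values are normalized, and the function names are hypothetical.

```python
BASE_SCORE = 60.0  # preset value of each App update, as stated in S43

def normalize(value, low, high):
    """Linearly map value from [low, high] to [-1.0, 1.0] (assumed scheme)."""
    if high == low:
        raise ValueError("low and high must differ")
    return 2.0 * (value - low) / (high - low) - 1.0

def update_score(raw_sentiments, low, high, base=BASE_SCORE):
    """Add the normalized sentiment value of every comment to the preset score."""
    return base + sum(normalize(v, low, high) for v in raw_sentiments)
```

Under this scheme each positive comment raises the score by at most 1 and each negative comment lowers it by at most 1, so an update with evenly mixed feedback stays close to 60.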

Correspondingly, an embodiment of the present invention further provides a content-based multi-version App update evaluation system, as shown in fig. 2, the system includes:

Further, the collection storage module includes:

Further, the analysis module includes:

Further, the comparing unit further includes a discriminating unit:

Further, the preprocessing module further comprises:

Further, the scoring module further comprises:

the sentiment analysis processing unit is used for acquiring all user comments corresponding to each App update and performing text sentiment analysis to obtain the users' sentiment analysis values; English comments are analyzed with a sentiment dictionary and a supervised classification algorithm to obtain their sentiment analysis values; Chinese comments are analyzed with an NLP API to obtain their sentiment analysis values.

Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.

The content-based multi-version App update evaluation method and system provided by the embodiments of the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is only intended to help in understanding the method and core idea of the invention. For a person skilled in the art, the specific embodiments and the scope of application may vary according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims

1. A content-based multi-version App update evaluation method, characterized by comprising the following steps:

acquiring comment information and comment times from an App application store and preprocessing them to obtain modified user comment information corresponding to each version of the App;

acquiring the modified user comment information corresponding to each version of App, and performing comparative scoring processing by combining the differences of different versions of Apps to acquire updated comprehensive emotion analysis numerical values of each App;

wherein automatically traversing the App, acquiring App information and storing the App information in a database specifically comprises:

traversing all Apps using a black-box technique, generating intelligent input with an intelligent input generation tool, and recording a screen snapshot and a UI hierarchy tree of each UI; storing system-level and application-level process information and the path information of the method started by each test input; and modifying the code to store all information in an unstructured database;

wherein acquiring the information in the database for analysis and identification processing to obtain the differences between different versions of the App specifically comprises:

acquiring App data from the database, selecting different versions of the same App, and making a list of the different versions of the App, wherein a program written in the Java language selects one group, namely the list of all different versions of the same App; acquiring the list of all different versions of the same App, and selecting two adjacent versions to extract the interface information of the two Apps; acquiring the interface information of the two Apps and comparing its content to obtain two interfaces that are similar; and acquiring the two similar interfaces and identifying all visible text in them to obtain the content that differs between the interfaces.
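As a hedged illustration of the listing and adjacent-version selection step, the following minimal sketch groups records by App and pairs each version with its successor. The claim names Java; Python is used here only for brevity. The record layout `(app, version)`, the dotted numeric version format and the function names are assumptions of this example, not part of the claimed method.

```python
from collections import defaultdict

def version_key(version):
    """Turn '7.11.0' into (7, 11, 0) so that versions sort numerically."""
    return tuple(int(part) for part in version.split("."))

def adjacent_version_pairs(records):
    """Group (app, version) records by App and pair each version with the next one."""
    versions_by_app = defaultdict(set)
    for app, version in records:
        versions_by_app[app].add(version)
    pairs = {}
    for app, versions in versions_by_app.items():
        ordered = sorted(versions, key=version_key)
        # zip(ordered, ordered[1:]) yields (old, new) for every adjacent pair.
        pairs[app] = list(zip(ordered, ordered[1:]))
    return pairs

# Hypothetical records as they might be read from the database.
records = [("shop", "7.11.0"), ("shop", "7.9.0"), ("shop", "7.10.0"), ("chat", "1.0")]
print(adjacent_version_pairs(records))
```

Sorting on the numeric tuple rather than the raw string keeps 7.9.0 ahead of 7.10.0, which a plain string sort would reverse.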

2. The content-based multi-version App update evaluation method according to claim 1, wherein the specific steps of acquiring the interface information of the two Apps and comparing its content comprise:

IC(w)＝-log(p(w))，

where p(w) is the frequency of occurrence of the word w in WordNet Sense, LCS(w₁, w₂) is the nearest common ancestor of the words w₁ and w₂, and the similarity of the two words, similar(w₁, w₂), is the sum of the information quantity of the two words divided by twice the information quantity of the nearest common ancestor of the two words;

for Chinese text, word segmentation is carried out firstly, and then text similarity comparison is carried out;
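For illustration only, the information-content quantities above can be computed on a toy taxonomy as follows. The taxonomy, the counts and all function names are hypothetical stand-ins for WordNet data. The sketch assumes the commonly used Lin form of the ratio, 2·IC(LCS(w₁, w₂)) / (IC(w₁) + IC(w₂)), which relates the same three quantities and stays within [0, 1]; if the reciprocal ratio is intended, the final expression would be inverted. Chinese word segmentation is not covered.

```python
import math

# Hypothetical toy taxonomy (child -> parent) and corpus counts; not WordNet data.
PARENT = {"dog": "animal", "cat": "animal", "animal": "entity", "car": "entity"}
COUNTS = {"dog": 3, "cat": 3, "animal": 2, "car": 4, "entity": 0}

def ancestors(word):
    """Return the path from word up to the root, starting with word itself."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def probability(word):
    """p(w): share of all counts that fall on w or on any concept below it."""
    total = sum(COUNTS.values())
    covered = sum(c for w, c in COUNTS.items() if word in ancestors(w))
    return covered / total

def information_content(word):
    """IC(w) = -log(p(w))."""
    return -math.log(probability(word))

def lcs(w1, w2):
    """Nearest common ancestor: first node on w1's upward path that is also above w2."""
    above_w2 = set(ancestors(w2))
    return next(a for a in ancestors(w1) if a in above_w2)

def lin_similarity(w1, w2):
    """Assumed Lin form: 2 * IC(LCS) / (IC(w1) + IC(w2))."""
    denom = information_content(w1) + information_content(w2)
    if denom == 0:
        return 1.0  # both words sit at the root and carry no information
    return 2 * information_content(lcs(w1, w2)) / denom

print(lcs("dog", "cat"), lin_similarity("dog", "cat"))
```

In this toy data, "dog" and "car" share only the root, whose information content is zero, so their similarity is zero.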

3. The method for evaluating the update of the content-based multi-version App according to claim 1, wherein the specific steps of identifying all visible characters in the two interfaces with similarity comprise:

since there are two interfaces, two multidimensional vectors are defined:

T_i＝{text₁，text₂，...，text_n}

T_j＝{text₁，text₂，...，text_m}

T_i represents the set of all visible text content in the new version, and T_j represents the set of all visible text content in the old version;

identification processing is performed on the sets T_i and T_j of the two interfaces: if text_k in T_i can be found in T_j, that is, the new-version content has an identical counterpart in the old version, text_k is judged to be repeated content; if text_k in T_i cannot be found in T_j, that is, the new-version content has no identical counterpart in the old version, text_k is judged to be added content; if text_l in T_j cannot be found in T_i, that is, the old-version content has no identical counterpart in the new version, text_l is judged to be deleted content;

the identified difference content is retrieved and stored in a database.
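The three-way judgment over T_i and T_j can be sketched, under the assumption that "found" means exact string equality, as follows. The function name `classify_texts` and the sample interface texts are hypothetical and serve only this example.

```python
def classify_texts(new_texts, old_texts):
    """Split visible texts into repeated, added and deleted content.

    new_texts plays the role of T_i (new version) and old_texts that of
    T_j (old version); matching is by exact string equality.
    """
    new_set, old_set = set(new_texts), set(old_texts)
    return {
        "repeated": [t for t in new_texts if t in old_set],
        "added": [t for t in new_texts if t not in old_set],
        "deleted": [t for t in old_texts if t not in new_set],
    }

# Hypothetical visible texts from two similar interfaces.
t_i = ["Home", "Messages", "Live"]   # new version
t_j = ["Home", "Messages", "Radio"]  # old version
print(classify_texts(t_i, t_j))
```

The resulting dictionary is the kind of difference record that could then be stored in the database.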

4. The method for evaluating the update of the multi-version App based on the content as claimed in claim 1, wherein the specific step of acquiring the comment information and the comment time of the App application store for preprocessing comprises the following steps:

obtaining all user comment data of a certain App by using a crawler tool;

merging the different inflected forms of words in the comments into their base form by using lemmatization and stemming;

and acquiring the comment information in base form and filtering out stop words to obtain the modified user comment information corresponding to each version of the App.
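A minimal sketch of the normalization and stop-word filtering steps is given below for English comments. The suffix stripper is a deliberately crude stand-in for a real lemmatizer or stemmer, and the stop-word list, suffix list and function names are assumptions of this example only.

```python
import re

STOP_WORDS = {"the", "a", "is", "it", "and", "to", "of", "this"}  # illustrative list
SUFFIXES = ("ing", "ed", "es", "s")  # crude stand-in for stemming rules

def stem(token):
    """Strip one common suffix, keeping at least four characters of the word."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token

def preprocess(comment):
    """Lowercase, tokenize, reduce words to a base form, then drop stop words."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    stems = [stem(t) for t in tokens]
    return [s for s in stems if s not in STOP_WORDS]

print(preprocess("The app crashes and crashed after updating"))
```

Here "crashes" and "crashed" collapse to the same base form, so comments about the same issue can be counted together.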

5. The method for evaluating the update of the multi-version App based on the content as claimed in claim 1, wherein the specific steps of comparing and scoring in combination with the differences of different versions of App comprise:

acquiring all user comments corresponding to App updating, and performing text sentiment analysis processing on English comments and Chinese comments respectively to obtain sentiment analysis values of the users;

6. A multi-version App update evaluation system based on content, the system comprising:

the preprocessing module is used for preprocessing comment information and comment times from an App application store to obtain modified user comment information corresponding to each version of the App;

the scoring module is used for performing comparative scoring on the modified user comment information corresponding to each version of the App in combination with the differences between different versions of the App, to obtain the updated comprehensive emotion analysis value of each App;

wherein the collection storage module comprises:

the dynamic recording unit is used for traversing all Apps using a black-box technique, generating intelligent input with an intelligent input generation tool, and recording a screen snapshot and a UI hierarchy tree of each UI; storing system-level and application-level process information and the path information of the method started by each test input; and modifying the code to store all information in an unstructured database;

wherein the analysis module comprises:

the tabulation unit is used for acquiring App data from the database, selecting different versions of the same App, and making a list of the different versions of the App, wherein a program written in the Java language on Eclipse selects one group, namely the list of all different versions of the same App;

7. The content-based multi-version App update evaluation system of claim 6, wherein the scoring module further comprises:

the emotion analysis processing unit is used for acquiring all user comments corresponding to App updating, and performing text emotion analysis processing on the English comments and the Chinese comments to acquire emotion analysis values of the users;