CN113742557A - Method and device for recommending application program identification rules - Google Patents

Method and device for recommending application program identification rules

Info

Publication number
CN113742557A
Authority
CN
China
Prior art keywords
rules
rule
application program
identified
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110914747.8A
Other languages
Chinese (zh)
Inventor
吕慧
吴春山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenyan Intelligent Technology Co ltd
Original Assignee
Beijing Shenyan Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenyan Intelligent Technology Co ltd
Priority to CN202110914747.8A
Publication of CN113742557A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for recommending application program identification rules. The method comprises the following steps: acquiring a target uniform resource locator and a target user agent entity from an operation flow that accesses the application program to be identified; disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules; acquiring the traffic trace details of the application program to be identified within a preset time period; determining, from the traffic trace details, the recognition rate and the recognition accuracy with which each of the rules identifies the application program; and recommending, according to the recognition rate and recognition accuracy of each rule, the rules that meet a preset condition as target identification rules of the application program. The method recommends the target identification rules of an app automatically, greatly improves recommendation efficiency, and solves the technical problem of low efficiency caused by the reliance on manual processing of app identification rules in the related art.

Description

Method and device for recommending application program identification rules
Technical Field
The invention relates to the technical field of app rules, in particular to a method and a device for recommending application program identification rules, a computer-readable storage medium and a processor.
Background
With the rapid development of the mobile internet and big data, and with the establishment of big data departments at the major operators in China, operator data is being used by more and more enterprises to guide marketing and operational decisions. Restoring massive volumes of user internet log data to the app behavior of users is an important and demanding task.
The scheme currently common in the industry is as follows. The demand side provides the requirements and usage scenarios, and the underlying technical capability must support rule identification down to the level of a specific app. The demand side provides a list of specific app names and app package names; an experienced person then installs each app on a mobile phone, operates it with a tool, and extracts the app-related rules by manually capturing packets.
This industry scheme has the following main defects:
1) Poor timeliness: once the demand side raises a requirement, processing depends entirely on experienced staff, so efficiency is low; when schedules are tight, requirements cannot be processed in time.
2) Low accuracy: because rule identification depends entirely on manual processing, manually identified rules may contain errors. For example, a software development kit (sdk) embedded in an app may produce confusing rules, and inexperienced personnel may mistake them for rules of the app itself. Although there is an auditing process, it also currently depends on manual processing and involves a degree of subjective judgment, so a high level of accuracy cannot be guaranteed.
3) Poor flexibility and high cost of updating data rules: apps are updated frequently, and rules may change when an app is updated. Rules that have been reviewed and confirmed for apps accumulate over time and must go through the analysis process again, so flexibility is poor and the cost of updating data is high.
Aiming at the problem of low efficiency caused by dependence on manual processing of app identification rules in the related technology, no effective solution is provided at present.
Disclosure of Invention
The invention mainly aims to provide a method, a device, a computer readable storage medium and a processor for recommending an application program identification rule, so as to solve the technical problem of low efficiency caused by the fact that the app identification rule in the related technology depends on manual processing.
In order to achieve the above object, according to one aspect of the present invention, there is provided a method of recommending an application identification rule, including: acquiring a target uniform resource locator and a target user agent entity from an operation flow that accesses the application program to be identified; disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules; obtaining the traffic trace details of the application program to be identified within a preset time period; determining, according to the traffic trace details, the recognition rate and the recognition accuracy with which each of the plurality of rules identifies the application program; and recommending, according to the recognition rate and recognition accuracy of each rule, the rules that meet a preset condition as target identification rules of the application program.
Optionally, disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules includes: identifying the domain names at each level, the url directories at each level, and the keywords in the url directories at each level in the target uniform resource locator; disassembling the target uniform resource locator into a plurality of parts according to those domain names, url directories, and keywords; and determining the plurality of parts as a plurality of the rules.
Optionally, disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules includes: identifying keywords in the target user agent entity; disassembling the target user agent entity into a plurality of parts according to a preset rule and the keywords in the target user agent entity; and determining the plurality of parts as a plurality of the rules.
Optionally, obtaining the traffic trace details of the application program to be identified in the preset time period includes: acquiring at least one operation flow on the application program to be identified and recording it as an operation log, wherein the operation log includes at least: the id of the device on which the application program to be identified is installed, the name of the application program to be identified, the start time of the operation flow, and the end time of the operation flow; collecting traffic data corresponding to the application program to be identified, wherein the traffic data includes at least http data and https data; and associating the operation log with the traffic data to form the traffic trace details.
Optionally, collecting the traffic data corresponding to the application program to be identified includes: establishing a unified traffic access outlet for the application program to be identified; and collecting the traffic data through the unified outlet.
Optionally, determining, according to the traffic trace details, the recognition rate and the recognition accuracy of the plurality of rules for the application program to be identified includes: matching the plurality of rules against the content of the traffic trace details and counting the number of successful matches of each rule; counting the number of operation flows contained in the traffic trace details; and calculating the recognition rate of each rule for the application program to be identified from the number of successful matches of that rule and the number of operation flows.
Optionally, determining, according to the traffic trace details, the recognition rate and the recognition accuracy of the plurality of rules for the application program to be identified includes: counting the number of application programs identified by each rule; and calculating the recognition accuracy of each rule from the number of application programs it identifies.
Optionally, recommending the rules that meet the preset condition as identification rules of the application program to be identified includes: filtering out first rules, whose recognition rate is lower than a first threshold; filtering out second rules, whose recognition accuracy is lower than a second threshold; and determining the rules that remain after the first rules and the second rules are removed as identification rules of the application program to be identified.
In order to achieve the above object, according to another aspect of the present invention, there is provided an apparatus for recommending an application identification rule, including: a first obtaining unit, configured to obtain a target uniform resource locator and a target user agent entity from an operation flow that accesses the application program to be identified; a processing unit, configured to disassemble the target uniform resource locator and the target user agent entity to obtain a plurality of rules; a second obtaining unit, configured to obtain the traffic trace details of the application program to be identified within a preset time period; a first determining unit, configured to determine, according to the traffic trace details, the recognition rate and the recognition accuracy with which each of the plurality of rules identifies the application program; and a second determining unit, configured to recommend, according to the recognition rate and recognition accuracy of each rule, the rules that meet a preset condition as target identification rules of the application program.
To achieve the above object, according to a further aspect of the present invention, there is provided a computer-readable storage medium including a stored program, wherein the program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform any one of the methods.
To achieve the above object, according to a further aspect of the present invention, there is provided a processor for executing a program, wherein the program executes to perform any one of the methods.
According to the method for recommending application program identification rules, a target uniform resource locator and a target user agent entity are first obtained from an operation flow that accesses the application program to be identified; the target uniform resource locator and the target user agent entity are then disassembled to obtain a plurality of rules; the traffic trace details of the application program to be identified within a preset time period are then obtained; the recognition rate and the recognition accuracy with which each rule identifies the application program are then determined from the traffic trace details; and finally the rules that meet the preset condition are recommended as target identification rules of the application program according to the recognition rate and recognition accuracy of each rule. In this method, the uniform resource locator url and the user agent entity ua are disassembled to obtain the plurality of rules, and the recognition rate and recognition accuracy with which each rule identifies the app are calculated from the traffic trace details, so that rules whose recognition rate and recognition accuracy meet the preset condition can be selected and recommended as the target identification rules of the app. This achieves automatic recommendation of the target identification rules of an app, greatly improves recommendation efficiency, and solves the technical problem of low efficiency caused by the reliance on manual processing of app identification rules in the related art.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow diagram of a method for recommending application identification rules, provided in accordance with an embodiment of the present invention;
FIG. 2 is a schematic diagram of identification rules for an app provided according to an embodiment of the invention;
fig. 3 is a schematic diagram of an apparatus for recommending application identification rules according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances in order to facilitate the description of the embodiments of the invention herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of description, some terms or expressions referring to the embodiments of the present invention are explained below:
url: a uniform resource locator;
ua: the user agent entity. The browser acts as an agent for the user and sends requests on the user's behalf; when the browser sends a request to an http server, it carries the user-agent in the request header to inform the server about the browser, so that the server can determine the operating system version and the browser version.
According to an embodiment of the present invention, a method of recommending application identification rules is provided.
FIG. 1 is a flow diagram of a method of recommending application identification rules, according to an embodiment of the present invention. As shown in FIG. 1, the method comprises the following steps:
Step S101, obtaining a target uniform resource locator and a target user agent entity from an operation flow that accesses the application program to be identified;
Step S102, disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules;
Step S103, obtaining the traffic trace details of the application program to be identified within a preset time period;
Step S104, determining, according to the traffic trace details, the recognition rate and the recognition accuracy with which each of the plurality of rules identifies the application program;
Step S105, recommending, according to the recognition rate and recognition accuracy of each rule, the rules that meet a preset condition as target identification rules of the application program.
In the method for recommending application program identification rules, a target uniform resource locator and a target user agent entity are first obtained from an operation flow that accesses the application program to be identified; the target uniform resource locator and the target user agent entity are then disassembled to obtain a plurality of rules; the traffic trace details of the application program to be identified within a preset time period are then obtained; the recognition rate and the recognition accuracy with which each rule identifies the application program are then determined from the traffic trace details; and finally the rules that meet the preset condition are recommended as target identification rules of the application program according to the recognition rate and recognition accuracy of each rule. In this method, the uniform resource locator url and the user agent entity ua are disassembled to obtain the plurality of rules, and the recognition rate and recognition accuracy with which each rule identifies the app are calculated from the traffic trace details, so that rules whose recognition rate and recognition accuracy meet the preset condition can be selected and recommended as the target identification rules of the app. This achieves automatic recommendation of the target identification rules of an app, greatly improves recommendation efficiency, and solves the technical problem of low efficiency caused by the reliance on manual processing of app identification rules in the related art.
It should be noted that the method uses the recognition rate and the recognition accuracy to decide whether each rule meets the preset condition, which improves the accuracy with which the recommended target identification rules identify the app. For example, even if a software development kit (sdk) embedded in the app produces confusing rules, those rules cannot meet the preset condition and so are not determined to be target identification rules of the app. In addition, rules may change as the app is updated, and rules that have been reviewed and confirmed for apps accumulate over time; with this method they can all be re-verified periodically, which improves flexibility and greatly reduces labor cost.
In an embodiment of the present application, disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules includes: identifying the domain names at each level, the url directories at each level, and the keywords in the url directories at each level in the target uniform resource locator; disassembling the target uniform resource locator into a plurality of parts according to those domain names, url directories, and keywords; and determining the plurality of parts as a plurality of the rules. Specifically, the url field of the target uniform resource locator is disassembled into a plurality of rules, which serve as candidates from which the rules meeting the preset condition are later screened out for recommendation. The url field includes domain names at each level, url directories at each level, and keywords in the url directories at each level. For example, as shown in FIG. 2, the identification rule of a certain mobile app includes a ua keyword, a host field, and a url expression.
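The url disassembly described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `url_candidate_rules`, the `host:` / `path:` / `keyword:` prefixes, the choice to treat each path segment as a keyword, and the example url are all hypothetical and are not specified in the source.

```python
from urllib.parse import urlsplit


def url_candidate_rules(url: str) -> list[str]:
    """Disassemble a url into candidate rules (hypothetical representation)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    rules: list[str] = []

    # Domain names at each level, e.g. api.shop.example.com, shop.example.com, example.com.
    for i in range(len(labels) - 1):
        rules.append("host:" + ".".join(labels[i:]))

    # url directories at each level, as cumulative path prefixes.
    segments = [s for s in parts.path.split("/") if s]
    for i in range(1, len(segments) + 1):
        rules.append("path:/" + "/".join(segments[:i]))

    # Keywords in the directories: assumed here to be the individual path segments.
    for segment in segments:
        keyword = "keyword:" + segment
        if keyword not in rules:
            rules.append(keyword)

    return rules


rules = url_candidate_rules("http://api.shop.example.com/v1/cart/add?id=3")
```

Each returned string is one candidate; the later steps decide which candidates are worth recommending.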
In an embodiment of the present application, disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules includes: identifying keywords in the target user agent entity; disassembling the target user agent entity into a plurality of parts according to a preset rule and the keywords in the target user agent entity; and determining the plurality of parts as a plurality of the rules. Specifically, the ua field of the target user agent entity is disassembled into a plurality of rules according to the preset rule, and these rules serve as candidates from which the rules meeting the preset condition are later screened out for recommendation. The preset rule can be set according to actual conditions.
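The source leaves the preset rule for disassembling the ua open. As one possible sketch, the example below assumes the preset rule is "drop parenthesised comments and keep the product name of each `name/version` token"; the function name `ua_candidate_rules` and the sample ua string are hypothetical.

```python
import re


def ua_candidate_rules(user_agent: str) -> list[str]:
    """Disassemble a ua string into candidate keyword rules (assumed preset rule)."""
    # Remove parenthesised comments such as "(Linux; Android 11)".
    without_comments = re.sub(r"\([^)]*\)", " ", user_agent)

    rules: list[str] = []
    for token in without_comments.split():
        # Keep the product name and discard the version after the slash.
        name = token.split("/", 1)[0]
        if name and name not in rules:
            rules.append(name)
    return rules


ua_rules = ua_candidate_rules("ShopApp/5.2.1 (Linux; Android 11) okhttp/4.9.0")
```

In this sample, one candidate comes from the app itself and one from a shared networking library. A keyword of the second kind is the sort of sdk-derived rule that the accuracy check later in the method is meant to reject.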
In an embodiment of the application, obtaining the traffic trace details of the application program to be identified in the preset time period includes: acquiring at least one operation flow on the application program to be identified and recording it as an operation log, wherein the operation log includes at least: the id of the device on which the application program to be identified is installed, the id of the application program to be identified, the name of the application program to be identified, the start time of the operation flow, and the end time of the operation flow; collecting traffic data corresponding to the application program to be identified, wherein the traffic data includes at least http data and https data; and associating the operation log with the traffic data to form the traffic trace details. Specifically, an operation log of the app is obtained that includes at least the fields listed above. The key fields of the traffic data are the device id, the resource locator url, the user agent entity ua, and the http request time. When the operation log and the traffic data are associated, their device ids must match and the http request time must fall within the operation time range. The device id may be any field that can distinguish different mobile phones, such as a vpn or proxy account, or a client ip and port. An operation flow represents one mobile phone operating one app intensively over a short period of time, and may be identified by device id + operation start time. The key fields of the resulting traffic trace details are the device id, the operation flow, the id of the app, the app name, the app operation start time, the app operation end time, the resource locator url, the user agent entity ua, and the http request time. In addition, the traffic trace details can also be generated by directly combining app monitoring data collected by a software development kit (SDK) with operator internet logs.
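The association step can be sketched as a join on device id plus a time-window check. The class and function names below (`OperationLog`, `TrafficRecord`, `build_trace_details`), the use of numeric timestamps, and the sample values are assumptions made for illustration; only the field list and the two matching conditions come from the description above.

```python
from dataclasses import dataclass


@dataclass
class OperationLog:
    device_id: str
    app_id: str
    app_name: str
    start_time: float  # operation start time (assumed to be epoch seconds)
    end_time: float    # operation end time


@dataclass
class TrafficRecord:
    device_id: str
    url: str
    ua: str
    request_time: float  # http request time


def build_trace_details(logs: list[OperationLog],
                        traffic: list[TrafficRecord]) -> list[dict]:
    """Associate operation logs with traffic data to form trace details."""
    details = []
    for record in traffic:
        for log in logs:
            same_device = record.device_id == log.device_id
            in_window = log.start_time <= record.request_time <= log.end_time
            if same_device and in_window:
                details.append({
                    "device_id": record.device_id,
                    # An operation flow is identified by device id + start time.
                    "operation_flow": f"{log.device_id}+{log.start_time}",
                    "app_id": log.app_id,
                    "app_name": log.app_name,
                    "start_time": log.start_time,
                    "end_time": log.end_time,
                    "url": record.url,
                    "ua": record.ua,
                    "request_time": record.request_time,
                })
    return details


logs = [OperationLog("dev1", "a1", "App A", 100.0, 200.0)]
traffic = [
    TrafficRecord("dev1", "http://a.example.com/x", "AppA/1.0", 150.0),  # kept
    TrafficRecord("dev1", "http://a.example.com/y", "AppA/1.0", 250.0),  # outside window
    TrafficRecord("dev2", "http://a.example.com/z", "AppA/1.0", 150.0),  # other device
]
details = build_trace_details(logs, traffic)
```

The nested loop is quadratic and is only meant to show the matching conditions; a real data set would more likely be joined by device id first.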
In an embodiment of the application, collecting the traffic data corresponding to the application program to be identified includes: establishing a unified traffic access outlet for the application program to be identified; and collecting the traffic data through the unified outlet. Specifically, a unified outlet for app traffic access is established as a data acquisition server, which may be a proxy server, a vpn server, a gateway, or the like, and the http data is collected uniformly through this data acquisition server, which makes the relevant data convenient to acquire.
In an embodiment of the application, determining the recognition rate and the recognition accuracy of the plurality of rules for the application program to be identified according to the traffic trace details includes: matching the plurality of rules against the content of the traffic trace details and counting the number of successful matches of each rule; counting the number of operation flows contained in the traffic trace details; and calculating the recognition rate of each rule for the application program to be identified from the number of successful matches of that rule and the number of operation flows. Specifically, the recognition rate of a rule for the app is the ratio of the number of times the rule identifies the app to the number of operation flows. The calculation is simple, which further improves recommendation efficiency. For example, suppose six operation flows simulate clicking through app A and four rules are generated, rule 1 to rule 4, where rule 1 identifies the app 5 times, rule 2 identifies it 3 times, and rule 3 and rule 4 each identify it once. The recognition rate of rule 1 is then 5/6, that is, about 83%; by the same calculation, the recognition rate of rule 2 is 3/6 = 50%, and that of rule 3 and rule 4 is 1/6 each, about 17%.
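The recognition rate calculation can be sketched as below. Two assumptions are made that the source does not state: a rule "matches" when it occurs as a substring of the url or ua content, and a rule is counted at most once per operation flow so that the rate cannot exceed 100%. The function name `recognition_rates` and the sample data are hypothetical.

```python
def recognition_rates(rules: list[str],
                      trace_details: list[tuple[str, str]]) -> dict[str, float]:
    """Return, per rule, matched operation flows divided by total operation flows.

    trace_details holds (operation_flow_id, content) pairs, where content is the
    url or ua text of one request.
    """
    all_flows = {flow_id for flow_id, _ in trace_details}
    rates = {}
    for rule in rules:
        # Count each operation flow at most once per rule (assumption).
        matched = {flow_id for flow_id, content in trace_details if rule in content}
        rates[rule] = len(matched) / len(all_flows) if all_flows else 0.0
    return rates


# Six operation flows on app A; the rule matches in five of them.
trace = [(f"flow{i}", "http://a.example.com/home") for i in range(5)]
trace.append(("flow5", "http://cdn.other.example.net/img"))
rates = recognition_rates(["a.example.com"], trace)
```

This reproduces the 5/6 figure from the worked example above for a rule that matches in five of six operation flows.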
In an embodiment of the application, determining the recognition rate and the recognition accuracy of the plurality of rules for the application program to be identified according to the traffic trace details includes: counting the number of application programs identified by each rule; and calculating the recognition accuracy of each rule from the number of application programs it identifies. Specifically, the recognition accuracy of a rule is the reciprocal of the number of apps identified by the rule. The calculation is simple, which further improves recommendation efficiency. For example, if rule 1 identifies two apps at the same time, the recognition accuracy of rule 1 is 1/2, that is, 50%.
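The reciprocal definition of recognition accuracy can be sketched as follows, again assuming substring matching. The function name `recognition_accuracies`, the value 0.0 for a rule that matches no app, and the sample data are assumptions for illustration.

```python
def recognition_accuracies(rules: list[str],
                           trace_details: list[tuple[str, str]]) -> dict[str, float]:
    """Return, per rule, the reciprocal of the number of distinct apps it matches.

    trace_details holds (app_id, content) pairs taken from the trace details of
    all apps under consideration.
    """
    accuracies = {}
    for rule in rules:
        apps = {app_id for app_id, content in trace_details if rule in content}
        # A rule that matches no app is given 0.0 here (assumption).
        accuracies[rule] = 1 / len(apps) if apps else 0.0
    return accuracies


trace = [
    ("appA", "ShopApp/5.2.1 okhttp/4.9.0"),
    ("appB", "NewsApp/2.0 okhttp/4.9.0"),
]
accuracies = recognition_accuracies(["ShopApp", "okhttp"], trace)
```

A keyword shared by two apps gets 50%, matching the example above, while a keyword seen only in one app gets 100%.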
In an embodiment of the application, recommending the rules that meet the preset condition as identification rules of the application program to be identified includes: filtering out first rules, whose recognition rate is lower than a first threshold; filtering out second rules, whose recognition accuracy is lower than a second threshold; and determining the rules that remain after the first rules and the second rules are removed as identification rules of the application program to be identified. Specifically, the first rules (recognition rate lower than the first threshold) and the second rules (recognition accuracy lower than the second threshold) are the rules that do not meet the preset condition; they are removed, and the remaining rules can be recommended as target identification rules. The first threshold and the second threshold may be selected according to actual conditions, for example a first threshold of 30% and a second threshold of 80%, to further improve the accuracy with which the recommended target identification rules identify the app.
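The filtering step can be sketched with the example thresholds of 30% and 80% given above as defaults. The function name `recommend_rules`, the dictionary inputs, and the sample accuracy values are hypothetical; the sample recognition rates follow the six-flow example.

```python
def recommend_rules(rates: dict[str, float],
                    accuracies: dict[str, float],
                    first_threshold: float = 0.30,
                    second_threshold: float = 0.80) -> list[str]:
    """Keep rules whose recognition rate and accuracy both reach the thresholds."""
    return [
        rule
        for rule, rate in rates.items()
        if rate >= first_threshold
        and accuracies.get(rule, 0.0) >= second_threshold
    ]


rates = {"rule1": 5 / 6, "rule2": 3 / 6, "rule3": 1 / 6, "rule4": 1 / 6}
accuracies = {"rule1": 1.0, "rule2": 0.5, "rule3": 1.0, "rule4": 1.0}  # assumed values
recommended = recommend_rules(rates, accuracies)
```

With these inputs, rule 2 is dropped for low accuracy and rules 3 and 4 for low recognition rate, leaving only rule 1.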
The embodiment of the present invention further provides a device for recommending an application identification rule, and it should be noted that the device for recommending an application identification rule according to the embodiment of the present invention may be used to execute the method for recommending an application identification rule according to the embodiment of the present invention. The following describes an apparatus for recommending application identification rules according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of an apparatus for recommending application identification rules according to an embodiment of the present invention. As shown in fig. 3, the apparatus includes:
a first obtaining unit 10, configured to obtain a target uniform resource locator and a target user agent entity from an operation flow performed by accessing an application to be identified;
a processing unit 20, configured to disassemble the target uniform resource locator and the target user agent entity to obtain a plurality of rules;
a second obtaining unit 30, configured to obtain details of a flow trajectory of the application to be identified in a preset time period;
a first determining unit 40, configured to determine, according to the traffic trajectory details, an identification rate and an identification accuracy rate for identifying the application program by using the plurality of rules;
a second determining unit 50, configured to recommend, according to the identification rate and the identification accuracy corresponding to the plurality of rules, the rules among the plurality of rules that meet a preset condition as target identification rules of the application program.
In the apparatus for recommending application identification rules, the first obtaining unit obtains a target uniform resource locator and a target user agent entity from an operation flow that accesses the application to be identified; the processing unit disassembles the target uniform resource locator and the target user agent entity to obtain a plurality of rules; the second obtaining unit obtains the traffic trajectory details of the application to be identified in a preset time period; the first determining unit determines, according to the traffic trajectory details, the identification rate and the identification accuracy with which the plurality of rules identify the application; and the second determining unit recommends, according to the identification rate and the identification accuracy corresponding to the plurality of rules, the rules that meet a preset condition as target identification rules of the application. The apparatus obtains the plurality of rules by disassembling the uniform resource locator (url) and the user agent entity (ua), and calculates the identification rate and the identification accuracy for the app from the traffic trajectory details, so that rules whose identification rate and identification accuracy meet the preset condition can be selected and recommended as target identification rules of the app. Automatic recommendation of target identification rules is thereby achieved, recommendation efficiency is greatly improved, and the technical problem in the related art of low efficiency caused by relying on manual processing of app identification rules is solved.
It should be noted that the method determines whether each of the plurality of rules meets the preset condition according to its identification rate and identification accuracy, which improves the accuracy with which the recommended target identification rules identify the app. For example, even if some software development kits (SDKs) embedded in the app produce confusing rules, those rules cannot meet the preset condition and are therefore not determined to be target identification rules of the app. Moreover, rules may change as the app is updated; because the apps whose rules have been reviewed and confirmed are accumulated over time, all of them can be re-verified periodically, which improves flexibility and greatly reduces labor cost.
In an embodiment of the present application, the processing unit includes a first identification module, a first processing module, and a first determining module. The first identification module is configured to identify the domain names at each level, the url directories at each level, and the keywords in the url directories at each level in the target uniform resource locator; the first processing module is configured to disassemble the target uniform resource locator into a plurality of parts according to those domain names, url directories, and keywords; and the first determining module is configured to determine the plurality of parts as a plurality of the rules. Specifically, the url field of the target uniform resource locator is disassembled into a plurality of rules, which serve as candidates from which rules meeting the preset condition are subsequently screened for recommendation. The url field includes domain names at each level, url directories at each level, and keywords in the url directories at each level. For example, as shown in fig. 2, the identification rule of a certain mobile app includes a ua keyword, a host field, and a url expression.
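One possible way to disassemble a url into candidate rules is sketched below using Python's standard `urllib.parse.urlsplit`. The text does not specify the exact decomposition, so the function name `url_candidate_rules`, the rule labels (`host`, `url_dir`, `url_keyword`), the choice to treat each path segment as a keyword, and the sample url are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def url_candidate_rules(url):
    """Split a url into (kind, value) candidates: domain levels, directory levels, keywords."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []

    # Domain names at each level, shortest first; the bare top-level label is skipped.
    domains = [".".join(labels[i:]) for i in range(len(labels) - 2, -1, -1)]

    # Non-empty path segments, e.g. "/v2/user/login" -> ["v2", "user", "login"].
    segments = [s for s in parts.path.split("/") if s]

    # Directories at each level: "/v2", "/v2/user", "/v2/user/login".
    directories = ["/" + "/".join(segments[:i]) for i in range(1, len(segments) + 1)]

    rules = [("host", d) for d in domains]
    rules += [("url_dir", d) for d in directories]
    rules += [("url_keyword", s) for s in segments]  # assumption: each segment is a keyword
    return rules


rules = url_candidate_rules("http://api.example.com/v2/user/login?id=1")
for kind, value in rules:
    print(kind, value)
```

This sketch ignores the query string and does not special-case IP-address hosts; a real decomposition would need a policy for both.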
In an embodiment of the application, the processing unit further includes a second identification module and a second processing module. The second identification module is configured to identify keywords in the target user agent entity; the second processing module is configured to split the target user agent entity into a plurality of parts according to a preset rule and the keywords in the target user agent entity, and to determine the plurality of parts as a plurality of rules. Specifically, the ua field of the target user agent entity is disassembled into a plurality of rules according to the preset rule, and these rules serve as candidates from which rules meeting the preset condition are subsequently screened for recommendation. The preset rule can be set according to the actual situation.
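Because the preset rule for splitting the ua field is left to the implementer, the following is only one conceivable choice: take the product name of each `name/version` token and each semicolon-separated item of a parenthesised comment as a keyword. The function name `ua_candidate_rules`, the regular expression, and the sample user agent string are hypothetical.

```python
import re

# Assumed preset rule: a token is either a parenthesised comment or "name[/version]".
_TOKEN = re.compile(r"\(([^)]*)\)|([^\s()/]+)(?:/(\S+))?")


def ua_candidate_rules(user_agent):
    """Return candidate ua keywords in order of first appearance, without duplicates."""
    keywords = []
    for comment, name, _version in _TOKEN.findall(user_agent):
        if comment:
            # Comment items such as "iPhone; iOS 14.6" are split on ";".
            keywords.extend(p.strip() for p in comment.split(";") if p.strip())
        elif name:
            # Version numbers change with app updates, so only the name is kept.
            keywords.append(name)
    return list(dict.fromkeys(keywords))


ua_rules = ua_candidate_rules("ExampleApp/3.2.1 (iPhone; iOS 14.6) NetType/WIFI")
print(ua_rules)
```

Dropping the version part is a design choice of this sketch, made so that a candidate keyword does not stop matching after every app release.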
In an embodiment of the application, the second obtaining unit includes a first obtaining module, a second obtaining module, and a third processing module. The first obtaining module is configured to obtain at least one operation flow on the application to be identified and record it as an operation log, where the operation log includes at least: the device id of the device on which the application to be identified is installed, the id of the application to be identified, the name of the application to be identified, the start time of the operation flow, and the end time of the operation flow. The second obtaining module is configured to collect traffic data corresponding to the application to be identified, where the traffic data includes at least http data and https data. The third processing module is configured to associate the operation log with the traffic data to form the traffic trajectory details. Specifically, the key fields of the traffic data include the device id, the resource locator url, the user agent entity ua, and the http request time. When the operation log and the traffic data are associated, their device ids must match and the http request time must fall within the operation time range. The device id may be any field that can distinguish different mobile phones, such as a vpn or proxy account, or a client ip and port. An operation flow represents a concentrated series of operations performed by one mobile phone on one app within a short period of time, and may be identified by the device id plus the operation start time. The key fields of the resulting traffic trajectory details include the device id, the operation flow, the app id, the app name, the app operation start time, the app operation end time, the resource locator url, the user agent entity ua, and the http request time. In addition, the traffic trajectory details may also be generated directly by combining app monitoring data collected by a software development kit (SDK) with an operator's internet access log.
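The association of operation logs with traffic data can be sketched as a join on device id and request time. The class names, field names, time unit, and sample records below are assumptions for illustration only; the nested loop is chosen for clarity, not efficiency.

```python
from dataclasses import dataclass


@dataclass
class OperationLog:
    device_id: str
    app_id: str
    app_name: str
    start_time: float  # assumed unit: seconds
    end_time: float


@dataclass
class TrafficRecord:
    device_id: str
    url: str
    ua: str
    request_time: float


def build_trajectory_details(logs, traffic):
    """Join traffic records to operation logs by device id and time window."""
    details = []
    for record in traffic:
        for log in logs:
            same_device = record.device_id == log.device_id
            in_window = log.start_time <= record.request_time <= log.end_time
            if same_device and in_window:
                details.append({
                    "device_id": record.device_id,
                    # An operation flow is identified by device id + start time.
                    "operation_flow": (log.device_id, log.start_time),
                    "app_id": log.app_id,
                    "app_name": log.app_name,
                    "start_time": log.start_time,
                    "end_time": log.end_time,
                    "url": record.url,
                    "ua": record.ua,
                    "request_time": record.request_time,
                })
    return details


logs = [OperationLog("dev-1", "app-001", "App A", 100.0, 160.0)]
traffic = [
    TrafficRecord("dev-1", "http://api.example.com/v2/login", "ExampleApp/1.0", 120.0),
    TrafficRecord("dev-2", "http://api.example.com/v2/login", "ExampleApp/1.0", 120.0),
    TrafficRecord("dev-1", "http://other.example.net/ping", "Other/2.0", 300.0),
]
details = build_trajectory_details(logs, traffic)
print(len(details), details[0]["app_name"])
```

Only the first traffic record is kept: the second comes from a different device, and the third falls outside the operation time range.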
In an embodiment of the application, the second obtaining module includes an establishing submodule and a collecting submodule. The establishing submodule is configured to establish a unified traffic access outlet for the application to be identified; the collecting submodule is configured to collect the traffic data through the unified outlet. Specifically, a unified outlet for app traffic access is established as a data acquisition server, which may be a proxy server, a vpn server, a gateway, or the like, and the http data are collected uniformly by this data acquisition server, so that the relevant data can be obtained conveniently.
In an embodiment of the application, the first determining unit includes a fourth processing module, a first statistical module, and a first calculating module. The fourth processing module is configured to match the plurality of rules against the content of the traffic trajectory details and count the number of successful matches of each rule; the first statistical module is configured to count the number of operation flows contained in the traffic trajectory details; and the first calculating module is configured to calculate the recognition rate of each rule for the application to be identified from the number of successful matches of each rule and the number of operation flows. Specifically, the recognition rate of a rule for an app is the ratio of the number of times the rule identifies the app to the number of operation flows. The calculation is simple, which further improves recommendation efficiency. For example, suppose six operation flows are recorded while simulated clicks are performed on app A, and four rules (rule 1, rule 2, rule 3, and rule 4) are generated, where rule 1 identifies the app 5 times, rule 2 identifies it 3 times, and rule 3 and rule 4 each identify it once; the recognition rate of rule 1 is then 5/6, that is, about 83%.
In an embodiment of the application, the first determining unit further includes a second statistical module and a second calculating module. The second statistical module is configured to count the number of applications identified by each rule; the second calculating module is configured to calculate the recognition accuracy of each rule from that number. Specifically, the recognition accuracy of a rule is the reciprocal of the number of apps the rule identifies. The calculation is simple, which further improves recommendation efficiency. For example, if rule 1 identifies two apps at the same time, the recognition accuracy of rule 1 is 1/2, that is, 50%.
In an embodiment of the application, the second determining unit includes a fifth processing module, a sixth processing module, and a second determining module. The fifth processing module is configured to filter out first rules, whose recognition rate is lower than a first threshold; the sixth processing module is configured to filter out second rules, whose recognition accuracy is lower than a second threshold; and the second determining module is configured to determine the rules that remain after the first rules and the second rules are excluded as the identification rules of the application to be identified. Specifically, both the first rules and the second rules fail the preset condition; once they are removed, the remaining rules can be recommended as target identification rules. The first threshold and the second threshold may be selected according to the actual situation, for example a first threshold of 30% and a second threshold of 80%, to further improve the accuracy with which the recommended target identification rules identify the app.
The device for recommending the application program identification rule comprises a processor and a memory, wherein the first acquiring unit, the processing unit, the second acquiring unit, the first determining unit, the second determining unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor includes a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and the technical problem in the related art of low efficiency caused by relying on manual processing of app identification rules is solved by adjusting kernel parameters.
The memory may include volatile memory in a computer readable medium, Random Access Memory (RAM) and/or nonvolatile memory such as Read Only Memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored, which when executed by a processor implements the above-described method.
The embodiment of the invention provides a processor, which is used for running a program, wherein the method is executed when the program runs.
The embodiment of the invention provides equipment, which comprises a processor, a memory and a program which is stored on the memory and can run on the processor, wherein the processor executes the program and realizes the following steps:
step S101, obtaining a target uniform resource locator and a target user agent entity from operation flow of accessing an application program to be identified;
step S102, the target uniform resource locator and the target user agent entity are disassembled to obtain a plurality of rules;
step S103, obtaining the flow track details of the application program to be identified in a preset time period;
step S104, determining the recognition rate and the recognition accuracy rate of recognizing the application program by adopting the plurality of rules according to the flow track details;
step S105, recommending the rule meeting the preset condition in the plurality of rules as the target identification rule of the application program according to the identification rate and the identification accuracy corresponding to the plurality of rules.
The device herein may be a server, a PC, a PAD, a mobile phone, etc.
The invention also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
step S101, obtaining a target uniform resource locator and a target user agent entity from operation flow of accessing an application program to be identified;
step S102, the target uniform resource locator and the target user agent entity are disassembled to obtain a plurality of rules;
step S103, obtaining the flow track details of the application program to be identified in a preset time period;
step S104, determining the recognition rate and the recognition accuracy rate of recognizing the application program by adopting the plurality of rules according to the flow track details;
step S105, recommending the rule meeting the preset condition in the plurality of rules as the target identification rule of the application program according to the identification rate and the identification accuracy corresponding to the plurality of rules.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media do not include transitory computer readable media (transitory media) such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present invention, and are not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (11)

1. A method for recommending application identification rules, comprising:
acquiring a target uniform resource locator and a target user agent entity from operation flow for accessing an application program to be identified;
disassembling the target uniform resource locator and the target user agent entity to obtain a plurality of rules;
obtaining the flow track detail of the application program to be identified in a preset time period;
determining the recognition rate and the recognition accuracy rate of recognizing the application program by adopting the multiple rules according to the flow track details;
and recommending the rules meeting preset conditions in the plurality of rules as target identification rules of the application program according to the identification rates and the identification accuracy rates corresponding to the plurality of rules.
2. The method of claim 1, wherein the parsing the target uniform resource locator and the target user agent entity to obtain the plurality of rules comprises:
identifying domain names at all levels, url directories at all levels and key words in the url directories at all levels in the target uniform resource locator;
disassembling the target uniform resource locator into a plurality of parts according to the domain names at all levels, the url directories at all levels and the keywords in the url directories at all levels;
determining the plurality of portions as a plurality of the rules.
3. The method of claim 1, wherein the parsing the target uniform resource locator and the target user agent entity to obtain the plurality of rules comprises:
identifying keywords in the target user agent entity;
and disassembling the target user agent entity into a plurality of parts according to a preset rule and the keywords in the target user agent entity, and determining the plurality of parts as a plurality of the rules.
4. The method of claim 1, wherein obtaining the traffic trajectory details of the application to be identified within a preset time period comprises:
acquiring at least one operation flow on the application program to be identified and recording the operation flow as an operation log, wherein the operation log at least comprises the following contents: the device id of the device on which the application program to be identified is installed, the name of the application program to be identified, the start time of the operation flow and the end time of the operation flow;
collecting flow data corresponding to the application program to be identified, wherein the flow data at least comprises http data and https data;
and associating the operation log and the flow data into the flow track detail.
5. The method of claim 4, wherein collecting the traffic data corresponding to the application to be identified comprises:
establishing a unified access flow outlet of the application program to be identified;
and collecting the flow data through the unified outlet.
6. The method of claim 1, wherein determining the recognition rate and recognition accuracy rate of the plurality of rules for the application to be recognized according to the traffic trajectory details comprises:
matching the plurality of rules with the content in the traffic trajectory details, and counting the number of successful matches of each rule with the content;
counting the number of operation flows contained in the traffic trajectory details;
and calculating the recognition rate of each rule for the application program to be identified according to the number of successful matches of each rule and the number of operation flows.
7. The method of claim 6, wherein determining the recognition rate and recognition accuracy rate of the plurality of rules for the application to be recognized according to the traffic trajectory details comprises:
counting the number of application programs identified by each rule;
and calculating the recognition accuracy of each rule according to the number of application programs identified.
8. The method according to claim 7, wherein recommending the rule meeting a preset condition as the identification rule of the application program to be identified comprises:
filtering out first rules corresponding to the recognition rate lower than a first threshold value;
filtering out a second rule corresponding to the identification accuracy rate lower than a second threshold value;
and determining a rule which eliminates the first rule and the second rule as an identification rule of the application program to be identified.
9. An apparatus for recommending application identification rules, comprising:
a first acquisition unit, configured to acquire a target uniform resource locator and a target user agent entity from operation flow for accessing an application program to be identified;
a processing unit, configured to disassemble the target uniform resource locator and the target user agent entity to obtain a plurality of rules;
a second acquisition unit, configured to acquire the flow track details of the application program to be identified in a preset time period;
a first determining unit, configured to determine, according to the flow track details, the recognition rate and the recognition accuracy rate of recognizing the application program by adopting the plurality of rules;
and a second determining unit, configured to recommend, according to the identification rates and the identification accuracy rates corresponding to the plurality of rules, the rules meeting a preset condition in the plurality of rules as target identification rules of the application program.
10. A computer-readable storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the method of any one of claims 1 to 8.
11. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of any of claims 1 to 8.
CN202110914747.8A 2021-08-10 2021-08-10 Method and device for recommending application program identification rules Pending CN113742557A (en)

Publications (1)

Publication Number Publication Date
CN113742557A (en) 2021-12-03

Family

ID=78730702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110914747.8A Pending CN113742557A (en) 2021-08-10 2021-08-10 Method and device for recommending application program identification rules

Country Status (1)

Country Link
CN (1) CN113742557A (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211203