WO2021004344A1 - Risk identification method based on data analysis and related equipment - Google Patents

Risk identification method based on data analysis and related equipment

Info

Publication number
WO2021004344A1
Authority
WO
WIPO (PCT)
Prior art keywords
tags
risk
list
group
combined
Application number
PCT/CN2020/099556
Other languages
English (en)
French (fr)
Inventor
陈伟
陈伟平
马倩
高瀚
王辉
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2021004344A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities

Definitions

  • This application relates to the field of computer technology, and in particular to a risk identification method and related equipment based on data analysis.
  • To improve the rate of risk identification, risk identification such as the prediction of violations or crimes is usually carried out on a large scale over one or more areas.
  • The area to be predicted can be divided into multiple grids.
  • The random division method from the seismic field and the kernel density estimation method from mathematical statistics are then used to determine the risk situation of each of the multiple grids.
  • the embodiments of the present application provide a risk identification method and related equipment based on data analysis, which can solve the technical problems of lack of pertinence in the prior art risk identification process and low identification accuracy.
  • An embodiment of the present application provides a risk identification method based on data analysis, including: acquiring risk data of a target object in a target scene, the risk data including at least one label for risk prediction; determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene, wherein the risk prediction model is constructed according to at least one set of combined tags in the target scene, each set of combined tags includes multiple tags, and the multiple tags are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, adding the target object's information to the mark list; collecting the behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to the preset behavior evaluation rules; and generating a file including the information of the target object and the category to which the behavior data belongs.
  • An embodiment of the present application provides a risk identification device based on data analysis, including: an acquiring unit, configured to acquire risk data of a target object in a target scene, the risk data including at least one label for risk prediction; a determining unit, configured to determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene, wherein the risk prediction model is constructed according to at least one set of combined tags in the target scene, each set of combined tags includes multiple tags, and the multiple tags are connected by logical connectives; an adding unit, configured to add the information of the target object to the mark list when the risk prediction result indicates that the target object is at risk; and a processing unit, configured to collect the behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to the preset behavior evaluation rules, and generate a file including the information of the target object and the category to which the behavior data belongs.
  • An embodiment of the present application provides an electronic device, including a processor and a memory that are connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to perform the following steps: obtaining risk data of the target object in the target scene, the risk data including at least one label for risk prediction; determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene, wherein the risk prediction model is constructed according to at least one set of combined tags in the target scene, each set of combined tags includes multiple tags, and the multiple tags are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, adding the information of the target object to the mark list; collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to preset behavior evaluation rules; and generating a file including the information of the target object and the category to which the behavior data belongs.
  • An embodiment of the present application provides a computer-readable storage medium that stores a computer program, and the computer program is executed by a processor to implement the following steps: obtaining the risk data of the target object in the target scene, the risk data including at least one label for risk prediction; determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene, wherein the risk prediction model is constructed according to at least one set of combined tags in the target scene, each set of combined tags includes multiple tags, and the multiple tags are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, adding the information of the target object to the mark list; collecting the behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to preset behavior evaluation rules; and generating a file including the information of the target object and the category to which the behavior data belongs.
  • In the embodiments of the present application, the electronic device can obtain the risk data of the target object in the target scene, and determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene. When the risk prediction result indicates that the target object is at risk, the electronic device can add the target object's information to the mark list, collect the target object's behavior data within a preset time range, determine the category of the behavior data according to the preset behavior determination rules, and generate a file that includes the information of the target object and the category to which the behavior data belongs. This makes the risk identification process more targeted and improves the accuracy of risk identification.
  • FIG. 1 is a schematic flowchart of a method for risk identification based on data analysis provided by an embodiment of the present application.
  • Fig. 2 is a schematic flowchart of another risk identification method based on data analysis provided by an embodiment of the present application.
  • Fig. 3 is a schematic structural diagram of a risk identification device based on data analysis provided by an embodiment of the present application.
  • Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a method for risk identification based on data analysis provided by an embodiment of this application.
  • This method can be applied to electronic devices.
  • the electronic device can be a terminal or a server.
  • the terminal can be a tablet computer, a notebook computer, or a desktop computer.
  • The server can be a single server or a server cluster. Specifically, the method may include the following steps.
  • S101 Acquire risk data of the target object in the target scene.
  • The target object can be any object, any object in the target scene, any object in the target scene that needs risk identification, or any object, entered or searched for, that needs risk identification in the target scene.
  • This object includes but is not limited to people.
  • The target scene may be any scene, any scene that requires risk supervision, or a scene, among multiple scenes, whose accident frequency is greater than or equal to the preset frequency.
  • The object can be further subdivided according to different scenes.
  • For example, in a car accident risk scenario, the object includes but is not limited to objects such as drivers.
  • In other scenarios, the object includes but is not limited to objects such as customers or staff of the corresponding institution.
  • the risk data includes at least one label used for risk prediction.
  • the label can be a keyword.
  • the risk data in a car accident risk scenario may include tags such as weather in the area, road conditions, and driver information (such as driver age and/or vehicle information).
  • the electronic device may obtain the risk data of the target object in the target scene from the information server corresponding to the target scene.
  • the information server includes but is not limited to at least one of the following: a traffic management server, a weather server, and a map server.
  • the electronic device may obtain driver information from a traffic management server, obtain weather information of the area where the target object is located from a weather server, and obtain driving and road condition information from a traffic management server or a map server.
  • the electronic device may send a risk data acquisition request to the information server corresponding to the target scene, and receive the risk data of the target object in the target scene returned by the information server in response to the risk data acquisition request.
  • the electronic device obtains risk data of multiple objects in the target scene, and queries the risk data of the target object in the target scene from the risk data of the multiple objects in the target scene.
  • Alternatively, the electronic device obtains a set of risk information of the target object in the target scene, where the set includes at least one piece of information used for risk prediction. The electronic device can then perform label extraction on each piece of information in the set to obtain the risk data of the target object in the target scene.
  • the electronic device may obtain the risk information set of the target object in the target scene from the information server corresponding to the target scene.
  • S102 Determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene.
  • The risk prediction model is constructed based on at least one set of combined tags in the target scene, and each set of combined tags includes multiple tags that are connected by logical connectives.
  • The logical connective may be "and" and/or "or".
  • For example, the at least one set of combined tags includes a first set of combined tags and a second set of combined tags: the first set is (label 1 and label 2 and label 3), and the second set is (label 1 or label 2 or label 3).
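  • As an illustration only, a combined tag can be read as a Boolean expression over labels. The sketch below assumes that risk data is a set of label strings and that each combined tag uses a single connective; the names `CombinedTag` and `matches` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class CombinedTag:
    """One group of combined tags: labels joined by a single connective."""
    connective: str          # "and" or "or" (assumed: one connective per group)
    labels: FrozenSet[str]


def matches(tag: CombinedTag, risk_labels: Set[str]) -> bool:
    """Return True if the observed labels satisfy the combined tag."""
    if tag.connective == "and":
        return tag.labels <= risk_labels       # every label must be present
    if tag.connective == "or":
        return bool(tag.labels & risk_labels)  # at least one label is present
    raise ValueError(f"unsupported connective: {tag.connective!r}")


# The two example groups from the text above.
first_group = CombinedTag("and", frozenset({"label 1", "label 2", "label 3"}))
second_group = CombinedTag("or", frozenset({"label 1", "label 2", "label 3"}))
observed = {"label 1", "label 3"}
```

  • With `observed` missing label 2, `matches(first_group, observed)` is false while `matches(second_group, observed)` is true.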
  • The risk prediction model may be the correspondence between each group of combined tags in the at least one group of combined tags and the corresponding risk prediction result. Alternatively, it may be a trained preset model, obtained by inputting the at least one group of combined tags and the risk prediction result corresponding to each group of combined tags into a preset model for training.
  • the risk prediction result may be the accident rate.
  • When the risk prediction result is the accident rate, the risk prediction result indicates that the target object is at risk if the accident rate is greater than or equal to the preset value.
  • The risk prediction result may also be a result indicating whether there is a risk or whether an accident has occurred, and the result includes but is not limited to being presented in the form of numbers, words, letters, etc.
  • In that case, if the result indicates that there is a risk or that an accident has occurred, the risk prediction result indicates that the target object is at risk.
  • the risk prediction result may also include the category of the risk accident.
  • In one embodiment, the electronic device determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene may include: the electronic device queries the risk prediction result corresponding to the risk data from the correspondence between each group of combined tags and the corresponding risk prediction result, and determines the queried result as the risk prediction result of the target object.
  • the embodiment of the application can easily determine the risk prediction result of the target object by querying the corresponding relationship.
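  • The lookup described above can be sketched as follows. This is a minimal example under stated assumptions: the accident rates, the preset value, and the rule of taking the highest rate when several groups match are all hypothetical, and `predict_accident_rate` and `is_at_risk` are illustrative names.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

Rule = Tuple[str, FrozenSet[str]]  # (connective, labels)

# Hypothetical correspondence between combined tags and accident rates.
CORRESPONDENCE: Dict[Rule, float] = {
    ("and", frozenset({"heavy rain", "congested road conditions"})): 0.30,
    ("or", frozenset({"heavy rain", "congested road conditions"})): 0.10,
}
PRESET_VALUE = 0.20  # hypothetical threshold


def predict_accident_rate(risk_labels: Set[str]) -> Optional[float]:
    """Query the accident rate for the risk data; None if no group matches."""
    rates = []
    for (connective, labels), rate in CORRESPONDENCE.items():
        if connective == "and":
            hit = labels <= risk_labels
        else:
            hit = bool(labels & risk_labels)
        if hit:
            rates.append(rate)
    # Assumption: when several groups match, keep the highest rate.
    return max(rates) if rates else None


def is_at_risk(risk_labels: Set[str]) -> bool:
    """At risk when the accident rate is greater than or equal to the preset value."""
    rate = predict_accident_rate(risk_labels)
    return rate is not None and rate >= PRESET_VALUE
```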
  • In another embodiment, the electronic device determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene may include: the electronic device inputs the risk data into the trained preset model to perform risk prediction, and the trained preset model outputs the risk prediction result of the target object.
  • the risk prediction result of the target object is obtained through the model, and the risk prediction result of the target object can be determined quickly and accurately.
  • S103 When the risk prediction result indicates that the target object is at risk, add the information of the target object to the mark list.
  • In this way, the target object can be listed as a key supervision object, which makes it easier to manage the risky objects in the target scene, to follow up on the target object, and to keep track of the target object's behavior trends.
  • the target object's information may include the target object's identification, such as the target object's name, image (such as an avatar), ID number, contact information, and other information used to uniquely identify the target object.
  • the information of the target object may also include other information of the target object, such as information such as the work location and residential area of the target object, which are not listed in the embodiment of the present application.
  • the mark list can be used to record information about objects at risk in the target scene. For example, the mark list may be used to record information of users who are at risk of a car accident in a car accident risk scenario.
  • S104 Collect behavior data of the target object within a preset time range, and determine the category to which the behavior data belongs according to a preset behavior evaluation rule.
  • the preset time range can be any time range, which can be specifically set according to actual needs. For example, it may be the past three months starting from the current time, and/or it may also be the next three months starting from the current time.
  • the behavior data can be different according to different scenarios.
  • For example, in a car accident risk scenario, the behavior data may include data such as driving data.
  • The driving data includes, but is not limited to, photographed or recorded driving records of the target object, such as traffic violation data.
  • The traffic violation data includes but is not limited to at least one of the following: red light running records, rear-end collision records, speeding records, and pedestrian crossing records.
  • In other scenarios, the behavior data may include transaction data and other data.
  • The category can be a level, such as level 1, level 2, or level 3, where the severity of the behavior indicated by level 1 is lower than that indicated by level 2, and the severity of the behavior indicated by level 2 is lower than that indicated by level 3.
  • The category can also be one of unethical, illegal, or convicted.
  • the electronic device may collect behavior data of the target object within a preset time range from the information server. For example, in a car accident risk scenario, the electronic device may collect the driving data of the target object from the traffic management server.
  • In one embodiment, the electronic device determining the category to which the behavior data belongs according to a preset behavior determination rule may include: the electronic device inputs the behavior data into a preset classification model, and classifies the behavior data via the classification model to obtain the category to which the behavior data belongs.
  • The classification model may be obtained by training a designated model with a collected behavior data training set and the category to which each piece of behavior data in the training set belongs.
  • In another embodiment, the electronic device determining the category to which the behavior data belongs according to a preset behavior determination rule may include: the electronic device performs named entity recognition on the behavior data to extract each entity in the behavior data, performs semantic analysis on the behavior data to obtain the association relationship between the entities, matches the entities and the association relationship between them with behavior determination data under different categories, and determines the category to which the behavior data belongs according to the matching result.
  • The aforementioned association relationship may include behavior characteristics.
  • the behavior determination data may include a collection of legal provisions or a collection of legal rules.
  • the behavior determination data may further include a case collection, or a case collection corresponding to the legal provision, or a case collection corresponding to the legal rule.
  • For example, named entity recognition is performed on the behavior data to extract the entities "driver A" and "red light", and semantic analysis is performed on the behavior data to obtain the association relationship "running" between the entities.
  • The electronic device matches "driver A", "running" and "red light" with the behavior determination data (such as legal rules) under different categories, and determines the category of the behavior data as illegal according to the matching result.
  • the matching result may be the matched legal clause or legal rule, and the electronic device may determine the category corresponding to the matched legal clause or the category corresponding to the legal rule as the category to which the behavior data belongs.
  • the matched legal clause or legal rule may be the determined legal clause or legal rule with the highest degree of matching with the behavior data.
  • the matching result may also be a matched case, and the electronic device may determine the category corresponding to the matched case as the category to which the behavior data belongs.
  • the matched case may be the case determined to have the highest degree of matching with the behavior data.
  • the electronic device may also determine the category corresponding to the legal provision or legal rule corresponding to the matched case as the category to which the behavior data belongs.
  • each entity includes a subject and an object corresponding to each piece of data in the behavior data.
  • the association relationship between the entities includes the association relationship between the subject and the object corresponding to each piece of data in the behavior data.
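  • The matching step above can be sketched as follows. The sketch assumes the entities and association relationships have already been extracted (named entity recognition and semantic analysis are not implemented here), models each rule as a set of terms, and scores a match by term overlap. The rule table and the name `categorize` are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical behavior determination data: rule term sets per category.
DETERMINATION_DATA: Dict[str, List[Set[str]]] = {
    "illegal": [{"running", "red light"}, {"speeding"}],
    "unethical": [{"cutting in line"}],
}


def categorize(entities: Set[str], relations: Set[str]) -> Optional[str]:
    """Return the category of the rule with the highest degree of matching."""
    terms = entities | relations
    best_category: Optional[str] = None
    best_score = 0
    for category, rules in DETERMINATION_DATA.items():
        for rule in rules:
            score = len(rule & terms)  # assumed matching degree: shared terms
            if score > best_score:
                best_category, best_score = category, score
    return best_category  # None when nothing matches
```

  • For the example in the text, `categorize({"driver A", "red light"}, {"running"})` selects the "illegal" category.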
  • S105 Generate a file including the information of the target object and the category to which the behavior data belongs.
  • the electronic device may generate a file including the information of the target object and the category to which the behavior data belongs, so as to archive and query the information of the target object and the category to which the behavior data belongs.
  • the format of the file includes but is not limited to any of the following: doc, docx, pdf, excel.
  • the file may be an analysis report.
  • the electronic device may also generate a file including the information of the target object, the risk prediction result of the target object, and the category to which the behavior data belongs.
  • the electronic device may also generate a file that includes the information of the target object, the risk prediction result of the target object, the category to which the behavior data belongs, and other auxiliary determination data.
  • the other auxiliary determination data may refer to data of some other dimensions except the behavior data.
  • For example, the other auxiliary determination data may include data such as a video of the target object drinking alcohol or a video of the target object entering and leaving places such as bars.
  • the other auxiliary determination data may be used to assist in analyzing the reason why the behavior data belongs to the category. That is, the electronic device can determine the reason why the behavior data belongs to the category based on the other auxiliary determination data. Or, the other auxiliary determination data may also be used to analyze which subcategory the behavior data belongs to. That is, the electronic device may also determine that the behavior data belongs to the target subcategory under the category based on the other auxiliary determination data.
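  • A minimal sketch of the file generation step follows. JSON is used here only because it needs nothing beyond the standard library; the formats named in the text (doc, docx, pdf, excel) would need other tooling. The function name `generate_file` and the field names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def generate_file(path: str,
                  target_info: Dict[str, Any],
                  behavior_category: str,
                  risk_prediction: Optional[Any] = None,
                  auxiliary_data: Optional[Any] = None) -> Dict[str, Any]:
    """Write a record for the target object and return what was written."""
    record: Dict[str, Any] = {
        "target_object": target_info,
        "behavior_category": behavior_category,
    }
    # Optional fields, matching the optional variants described in the text.
    if risk_prediction is not None:
        record["risk_prediction_result"] = risk_prediction
    if auxiliary_data is not None:
        record["auxiliary_determination_data"] = auxiliary_data
    Path(path).write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return record
```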
  • When the electronic device is a terminal, the electronic device can display the file.
  • When the electronic device is a server, the electronic device can send the file to the corresponding device for display.
  • the electronic device may also output first alarm information to remind relevant personnel when the risk prediction result indicates that the target object is at risk.
  • The electronic device may also output second alarm information to remind relevant personnel when it determines that the category to which the behavior data belongs is a specified category.
  • the electronic device may also trigger the step of determining the category to which the behavior data belongs according to a preset behavior determination rule when the information of the target object is queried in the preset event table.
  • The preset event table records the information of objects that have had accidents in the target scene. Using the above method makes it easier for relevant personnel to characterize the accident.
  • the preset event table may be data such as traffic accident data or case filing data recorded by the traffic management server within a preset time period.
  • In the embodiment shown in FIG. 1, the electronic device can obtain the risk data of the target object in the target scene, and determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene. When the risk prediction result indicates that the target object is at risk, the electronic device adds the information of the target object to the mark list, collects the behavior data of the target object within a preset time range, determines the category to which the behavior data belongs according to the preset behavior determination rules, and generates a file including the information of the target object and the category to which the behavior data belongs. This makes the risk identification process more targeted and improves the accuracy of risk identification.
  • FIG. 2 is a schematic flowchart of another risk identification method based on data analysis provided by an embodiment of this application.
  • the method can be applied to an electronic device, and the electronic device can be a terminal or a server. Specifically, the method may include the following steps.
  • S201 Acquire a scene identifier of the target scene, and determine a target factor list corresponding to the scene identifier of the target scene according to a preset correspondence between the scene identifier and the factor list.
  • the scene identifier can be a scene name.
  • the target factor list refers to a factor list corresponding to the target scene, and the target factor list may include one or more factors.
  • the target factor list corresponding to the car accident risk scenario may include at least one of the following: weather, road conditions, driver, and age of the driver.
  • the above factors can be understood as a general term for a type of label.
  • weather may be a general term for labels describing weather such as sunny, cloudy, light rain, heavy rain, etc.
  • the electronic device may determine the target factor list corresponding to the scene identifier of the target scene from the factor library according to the preset correspondence between the scene identifier and the factor list. Among them, the factor library saves a list of factors corresponding to each scene.
  • The database stores at least one label list, as well as the correspondence between each label list in the at least one label list and a factor list.
  • For example, the target factor list includes a first factor and a second factor. If the first factor is road conditions, the labels corresponding to the first factor in the label list may be congested road conditions, smooth road conditions, etc. If the second factor is the driver's vehicle age, the labels corresponding to the second factor in the label list may be a driver's vehicle age of 2 years, a driver's vehicle age of 3 years, etc.
  • The electronic device may determine the tag list corresponding to the target factor list from the at least one tag list stored in the database, according to the correspondence between each tag list and the factor list.
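  • The two lookups above (scene identifier to factor list, then factor list to labels) can be sketched with plain dictionaries. This assumes the correspondence is stored per factor, which is a simplification; `FACTOR_LIBRARY`, `LABEL_DATABASE` and `labels_for_scene` are hypothetical names, and the entries are taken from the examples in the text.

```python
from typing import Dict, List

# Hypothetical factor library: scene identifier -> factor list.
FACTOR_LIBRARY: Dict[str, List[str]] = {
    "car accident risk": ["weather", "road conditions"],
}

# Hypothetical database: factor -> labels describing that factor.
LABEL_DATABASE: Dict[str, List[str]] = {
    "weather": ["sunny", "cloudy", "light rain", "heavy rain"],
    "road conditions": ["congested road conditions", "smooth road conditions"],
}


def labels_for_scene(scene_id: str) -> Dict[str, List[str]]:
    """Resolve a scene identifier to its factors, then to each factor's labels."""
    target_factors = FACTOR_LIBRARY[scene_id]
    return {factor: LABEL_DATABASE[factor] for factor in target_factors}
```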
  • the electronic device may perform combination processing on each label in the label list to obtain at least one set of combined labels.
  • The electronic device may randomly sample the tags in the tag list to obtain multiple sets of tags, where each of the multiple sets of tags includes multiple tags. The electronic device may then add logical connectives between the tags included in each set of tags to obtain at least one set of combined tags corresponding to each set of tags. For example, if the multiple sets of tags include a first set of tags (label 1, label 2, label 3), adding connectives between the labels in the first set yields at least the following two sets of combined tags: the first set of combined tags (label 1 and label 2 and label 3) and the second set of combined tags (label 1 or label 2 or label 3).
  • the aforementioned random sampling may be sampling with replacement.
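  • A minimal sketch of the sample-then-connect step follows. It assumes that labels do not repeat within a group while the same group may be drawn more than once across draws, and that each group yields one "and" form and one "or" form. The function name `sample_combined_tags` and its parameters are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def sample_combined_tags(labels: Sequence[str], group_size: int,
                         num_groups: int,
                         seed: Optional[int] = None) -> List[str]:
    """Draw groups of labels at random and join each with "and" and with "or"."""
    rng = random.Random(seed)
    combined: List[str] = []
    for _ in range(num_groups):
        # Labels are distinct inside one group; groups themselves may repeat.
        group = rng.sample(list(labels), group_size)
        for connective in ("and", "or"):
            combined.append("(" + f" {connective} ".join(group) + ")")
    return combined
```

  • Each drawn group contributes two strings in the same form as the example above, such as "(label 1 and label 2 and label 3)" and "(label 1 or label 2 or label 3)".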
  • In one embodiment, the electronic device combining the tags in the tag list to obtain at least one set of combined tags may include: the electronic device randomly samples the tags in the tag list to obtain multiple sets of tags, each of which includes multiple tags; the electronic device performs deduplication processing on the multiple sets of tags to obtain at least one set of tags, where the sets of tags in the at least one set of tags differ from one another; and the electronic device adds a preset logical connective between the tags included in each set of tags to obtain at least one set of combined tags corresponding to each set of tags. The tags included in each set of tags are not repeated. This method effectively reduces the repetition rate of the multiple sets of tags, and thereby the repetition rate of the combined tags corresponding to each set of tags.
  • The electronic device randomly sampling the tags in the tag list may include: the electronic device obtains the weight set for each tag in the tag list, and uses a weighted random sampling algorithm to randomly sample the tags according to those weights. In one embodiment, the higher the weight, the higher the probability of being sampled.
  • During deduplication processing, each time a set of tags is sampled, the electronic device may query whether the saved sets of tags already include that set. If the saved sets do not include it, the electronic device saves the set; if they do, the electronic device discards (for example, deletes) the set.
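  • Weighted sampling with deduplication on the fly can be sketched as below. The sketch assumes positive weights, treats a group as an unordered set of labels, and uses `random.choices` for each weighted draw; the application does not specify a particular weighted sampling algorithm. `weighted_group` and `sample_unique_groups` are hypothetical names.

```python
import random
from typing import FrozenSet, Optional, Sequence, Set


def weighted_group(labels: Sequence[str], weights: Sequence[float],
                   group_size: int, rng: random.Random) -> FrozenSet[str]:
    """Draw group_size distinct labels; higher weight means higher probability."""
    pool = list(zip(labels, weights))
    group = []
    for _ in range(group_size):
        index = rng.choices(range(len(pool)),
                            weights=[w for _, w in pool], k=1)[0]
        group.append(pool.pop(index)[0])  # remove so a label is not repeated
    return frozenset(group)


def sample_unique_groups(labels: Sequence[str], weights: Sequence[float],
                         group_size: int, draws: int,
                         seed: Optional[int] = None) -> Set[FrozenSet[str]]:
    """Keep a sampled group only if the saved groups do not already include it."""
    rng = random.Random(seed)
    saved: Set[FrozenSet[str]] = set()
    for _ in range(draws):
        group = weighted_group(labels, weights, group_size, rng)
        if group not in saved:
            saved.add(group)
        # otherwise the duplicate group is discarded
    return saved
```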
  • In another embodiment, the electronic device combining the tags in the tag list to obtain at least one set of combined tags may include: the electronic device uses a recursive algorithm to arrange and combine the tags in the tag list to obtain multiple sets of tags, each of which includes multiple tags; and the electronic device adds a preset logical connective between the tags included in each of the multiple sets of tags to obtain at least one set of combined tags corresponding to each set of tags.
  • This application uses a recursive algorithm to obtain multiple sets of tags, which can improve the efficiency of obtaining multiple sets of tags.
  • For example, assume the tag list includes n tags, from which the electronic device selects m tags to form each set of tags. Obtaining the multiple sets of tags with the above recursive algorithm may include: selecting the first tag in the tag list as the first element of a set of tags, and selecting (m-1) tags from the tags located after the first tag as the remaining (m-1) elements of the set; and so on, selecting in turn the second to the (n-m+1)-th tag in the tag list as the first element of a set of tags, and selecting (m-1) tags from the tags located after that tag as the remaining (m-1) elements of the set.
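  • The recursion described above can be sketched as follows: fix one tag as the first element, then recursively choose the remaining (m-1) tags from the tags after it. The function name `choose_groups` is hypothetical, and the sketch produces combinations (order within a group follows the tag list).

```python
from typing import List, Sequence


def choose_groups(labels: Sequence[str], m: int) -> List[List[str]]:
    """Return every group of m labels, keeping the order of the label list."""
    if m == 0:
        return [[]]  # one way to choose nothing
    groups: List[List[str]] = []
    n = len(labels)
    # The first element can be the 1st to the (n-m+1)-th label.
    for i in range(n - m + 1):
        # Choose the remaining (m-1) elements from the labels after position i.
        for rest in choose_groups(labels[i + 1:], m - 1):
            groups.append([labels[i]] + rest)
    return groups
```

  • For n = 4 and m = 2 this yields 6 groups, the same groups that `itertools.combinations` from the standard library would produce.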
  • the tags in the aforementioned tag list may be grouped by their corresponding factors, giving at least one tag for each factor.
  • during random sampling, the electronic device each time draws one tag from the at least one tag corresponding to each factor to construct a group of tags. Therefore, the tags in each group correspond to different factors.
  • the electronic device may also perform deduplication processing on at least one group of combined tags corresponding to each group of tags.
  • the de-duplication may consist of deleting, from the at least one group of combined tags, any group whose combined tags are identical to those of another group.
  • the electronic device may use the at least one set of combined tags to construct a risk prediction model of the target scene.
  • the electronic device using the at least one group of combined tags to construct a risk prediction model of the target scene may include: the electronic device establishes a correspondence between each group of combined tags and its risk prediction result, and determines that correspondence as the risk prediction model of the target scene. Using the correspondence as the model allows the risk prediction model to be determined quickly and effectively.
  • the electronic device using the at least one group of combined tags to construct a risk prediction model of the target scene may include: the electronic device inputs the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training, obtains the trained preset model, and determines the trained preset model as the risk prediction model of the target scene. Determining the risk prediction model through modeling can improve its scalability and prediction accuracy.
  • S205 Acquire risk data of the target object in the target scene.
  • S206 Determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene.
  • S207 When the risk prediction result indicates that the target object is at risk, add the information of the target object to the mark list.
  • S208 Collect behavior data of the target object within a preset time range, and determine the category to which the behavior data belongs according to a preset behavior evaluation rule.
  • S209 Generate a file including the information of the target object and the category to which the behavior data belongs.
  • for steps S205-S209, refer to steps S101-S105 in the embodiment of FIG. 1; details are not repeated here.
  • in the embodiment shown in FIG. 2, the electronic device can obtain the scene identifier of the target scene, and determine the target factor list corresponding to that scene identifier according to the preset correspondence between scene identifiers and factor lists; the electronic device can match the tag list corresponding to the target factor list from the database, and combine the tags in the tag list to obtain at least one group of combined tags. The at least one group of combined tags is then used to construct the risk prediction model of the target scene. This approach constructs the risk prediction model of the target scene quickly and effectively, automates the construction process, and improves construction efficiency.
  • FIG. 3 is a schematic structural diagram of a risk identification device based on data analysis provided by an embodiment of this application.
  • the device can be applied to electronic equipment.
  • the device may include the following units.
  • the obtaining unit 301 is configured to obtain risk data of the target object in the target scene; the risk data includes at least one tag used for risk prediction.
  • the determining unit 302 is configured to determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene; wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives.
  • the adding unit 303 is configured to add the information of the target object to the mark list when the risk prediction result indicates that the target object is at risk.
  • the processing unit 304 is configured to collect behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior evaluation rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs.
  • the determining unit 302 is further configured to obtain the scene identifier of the target scene through the obtaining unit 301, and determine the target factor list corresponding to the scene identifier of the target scene according to the preset correspondence between scene identifiers and factor lists; the target factor list includes one or more factors.
  • the processing unit 304 is further configured to match the tag list corresponding to the target factor list from the database, the database storing at least one tag list and the correspondence between each tag list and a factor list; combine the tags in the tag list to obtain at least one group of combined tags; and use the at least one group of combined tags to construct the risk prediction model of the target scene.
  • when combining the tags in the tag list to obtain at least one group of combined tags, the processing unit 304 specifically: randomly samples the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; de-duplicates the multiple groups to obtain at least one group of tags, the groups in which differ from one another; and adds a preset logical connective between the tags in each of the at least one group to obtain at least one group of combined tags corresponding to each group of tags.
  • when randomly sampling the tags in the tag list, the processing unit 304 specifically: obtains the weight set for each tag in the tag list; and applies a weighted random sampling algorithm to sample the tags according to those weights.
  • when combining the tags in the tag list to obtain at least one group of combined tags, the processing unit 304 specifically: uses a recursive algorithm to arrange and combine the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; and adds a preset logical connective between the tags in each of the multiple groups to obtain at least one group of combined tags corresponding to each group of tags.
  • when using the at least one group of combined tags to construct the risk prediction model of the target scene, the processing unit 304 specifically: establishes a correspondence between each group of combined tags and its risk prediction result, and determines that correspondence as the risk prediction model of the target scene; or inputs the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training, obtains the trained preset model, and determines the trained preset model as the risk prediction model of the target scene.
  • when determining the category to which the behavior data belongs according to a preset behavior determination rule, the processing unit 304 specifically: inputs the behavior data into a preset classification model, and classifies the behavior data with the classification model to obtain the category to which it belongs; or performs named entity recognition on the behavior data to extract the entities in it, performs semantic analysis on the behavior data to obtain the relationships between the entities, matches the entities and their relationships against behavior determination data under different categories, and determines the category to which the behavior data belongs according to the matching result.
  • in the embodiment shown in FIG. 3, the electronic device can obtain the risk data of the target object in the target scene, and determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene; when the risk prediction result indicates that the target object is at risk, the electronic device can add the information of the target object to the mark list, collect the behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior determination rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs. This can make the risk identification process more targeted and improve the accuracy of risk identification.
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
  • the electronic device described in this embodiment may include a processor 1000 and a memory 2000.
  • the processor 1000 and the memory 2000 may be connected by a bus as shown in FIG. 4 or in other ways.
  • the electronic device may further include one or more input devices 3000 and one or more output devices 4000.
  • the processor 1000, the memory 2000, one or more input devices 3000, and one or more output devices 4000 may be connected by a bus or other methods.
  • the input device 3000 includes, but is not limited to, touch screens, voice recorders, sensors, and other devices.
  • the output device 4000 includes but is not limited to devices such as a display screen and a speaker.
  • the touch screen and display can also be replaced with a touch display.
  • the input device 3000 and the output device 4000 may include standard wired or wireless communication interfaces.
  • the processor 1000 may be a central processing unit (CPU); the processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 2000 can be a high-speed RAM memory or a non-volatile memory, such as disk storage.
  • the memory 2000 is used to store a set of program codes, and the processor 1000, the input device 3000, and the output device 4000 can call the program codes stored in the memory 2000.
  • the processor 1000 is configured to obtain risk data of the target object in the target scene, the risk data including at least one tag used for risk prediction; determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene, wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, add the information of the target object to the mark list; collect behavior data of the target object within a preset time range, and determine the category to which the behavior data belongs according to a preset behavior evaluation rule; and generate a file that includes the information of the target object and the category to which the behavior data belongs.
  • the processor 1000 is further configured to obtain the scene identifier of the target scene, and determine the target factor list corresponding to the scene identifier of the target scene according to the preset correspondence between scene identifiers and factor lists, the target factor list including one or more factors; match the tag list corresponding to the target factor list from the database, the database storing at least one tag list and the correspondence between each tag list and a factor list; combine the tags in the tag list to obtain at least one group of combined tags; and use the at least one group of combined tags to construct the risk prediction model of the target scene.
  • when combining the tags in the tag list to obtain at least one group of combined tags, the processor 1000 specifically: randomly samples the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; de-duplicates the multiple groups to obtain at least one group of tags, the groups in which differ from one another; and adds a preset logical connective between the tags in each of the at least one group to obtain at least one group of combined tags corresponding to each group of tags.
  • when randomly sampling the tags in the tag list, the processor 1000 specifically: obtains the weight set for each tag in the tag list; and applies a weighted random sampling algorithm to sample the tags according to those weights.
  • when combining the tags in the tag list to obtain at least one group of combined tags, the processor 1000 specifically: uses a recursive algorithm to arrange and combine the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; and adds a preset logical connective between the tags in each of the multiple groups to obtain at least one group of combined tags corresponding to each group of tags.
  • when using the at least one group of combined tags to construct the risk prediction model of the target scene, the processor 1000 specifically: establishes a correspondence between each group of combined tags and its risk prediction result, and determines that correspondence as the risk prediction model of the target scene; or inputs the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training, obtains the trained preset model, and determines the trained preset model as the risk prediction model of the target scene.
  • when determining the category to which the behavior data belongs according to a preset behavior determination rule, the processor 1000 specifically: inputs the behavior data into a preset classification model, and classifies the behavior data with the classification model to obtain the category to which it belongs; or performs named entity recognition on the behavior data to extract the entities in it, performs semantic analysis on the behavior data to obtain the relationships between the entities, matches the entities and their relationships against behavior determination data under different categories, and determines the category to which the behavior data belongs according to the matching result.
  • in a specific implementation, the processor 1000, the input device 3000, and the output device 4000 described in the embodiments of this application can perform the implementations described in the embodiments of FIG. 1 and FIG. 2, as well as the other implementations described in the embodiments of this application; details are not repeated here.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the embodiments of this application also provide a computer-readable storage medium storing a computer program, the computer program being executed by a processor to implement the following steps: obtaining risk data of the target object in the target scene, the risk data including at least one tag used for risk prediction; determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene, wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, adding the information of the target object to the mark list; collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior evaluation rule; and generating a file that includes the information of the target object and the category to which the behavior data belongs.
  • the program can be stored in a computer-readable storage medium. When executed, it may include the processes of the above-mentioned method embodiments.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), etc.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Educational Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

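A risk identification method based on data analysis and a related device, applicable to smart cities, for example in the field of smart transportation. The method includes: obtaining risk data of a target object in a target scene (S101), the risk data including at least one tag used for risk prediction; determining a risk prediction result of the target object according to the risk data and a risk prediction model of the target scene (S102); when the risk prediction result indicates that the target object is at risk, adding information of the target object to a mark list (S103); collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior evaluation rule (S104); and generating a file that includes the information of the target object and the category to which the behavior data belongs (S105). The method can make the risk identification process more targeted and improve the accuracy of risk identification.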

Description

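Risk identification method based on data analysis and related device
This application claims priority to Chinese patent application No. 201910619081.6, filed with the China Patent Office on July 10, 2019 and entitled "Risk identification method based on data analysis and related device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a risk identification method based on data analysis and a related device.
Background
At present, to speed up risk identification, risk identification is usually performed on a large scale over one or more regions, for example to predict violations or crimes. Specifically, the region to be predicted can be divided into multiple grid cells, and, based on historical risk data for that region such as the number of incidents, the risk of each grid cell is determined using the stochastic declustering method from seismology and kernel density estimation from mathematical statistics. However, the inventors found that this risk identification process lacks specificity and has low identification accuracy.
Technical Problem
The embodiments of this application provide a risk identification method based on data analysis and a related device, which can solve the technical problem that risk identification in the prior art lacks specificity and has low identification accuracy.
Technical Solution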
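In a first aspect, an embodiment of this application provides a risk identification method based on data analysis, including: obtaining risk data of a target object in a target scene, the risk data including at least one tag used for risk prediction; determining a risk prediction result of the target object according to the risk data and a risk prediction model of the target scene, wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, adding information of the target object to a mark list; collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior evaluation rule; and generating a file that includes the information of the target object and the category to which the behavior data belongs.
In a second aspect, an embodiment of this application provides a risk identification device based on data analysis, including: an obtaining unit, configured to obtain risk data of a target object in a target scene, the risk data including at least one tag used for risk prediction; a determining unit, configured to determine a risk prediction result of the target object according to the risk data and a risk prediction model of the target scene, wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives; an adding unit, configured to add information of the target object to a mark list when the risk prediction result indicates that the target object is at risk; and a processing unit, configured to collect behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior evaluation rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs.
In a third aspect, an embodiment of this application provides an electronic device, including a processor and a memory that are connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to perform the following steps: obtaining risk data of a target object in a target scene, the risk data including at least one tag used for risk prediction; determining a risk prediction result of the target object according to the risk data and a risk prediction model of the target scene, wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, adding information of the target object to a mark list; collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior evaluation rule; and generating a file that includes the information of the target object and the category to which the behavior data belongs.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program, the computer program being executed by a processor to implement the following steps: obtaining risk data of a target object in a target scene, the risk data including at least one tag used for risk prediction; determining a risk prediction result of the target object according to the risk data and a risk prediction model of the target scene, wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives; when the risk prediction result indicates that the target object is at risk, adding information of the target object to a mark list; collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior evaluation rule; and generating a file that includes the information of the target object and the category to which the behavior data belongs.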
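Beneficial Effects
In summary, the electronic device can obtain risk data of a target object in a target scene, and determine a risk prediction result of the target object according to the risk data and the risk prediction model of the target scene. When the risk prediction result indicates that the target object is at risk, the electronic device can add information of the target object to a mark list, collect behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior determination rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs. This can make the risk identification process more targeted and improve the accuracy of risk identification.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a risk identification method based on data analysis provided by an embodiment of this application.
FIG. 2 is a schematic flowchart of another risk identification method based on data analysis provided by an embodiment of this application.
FIG. 3 is a schematic structural diagram of a risk identification device based on data analysis provided by an embodiment of this application.
FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Embodiments of the Invention
The technical solutions in the embodiments of this application are described below with reference to the accompanying drawings.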
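Refer to FIG. 1, which is a schematic flowchart of a risk identification method based on data analysis provided by an embodiment of this application. The method can be applied to an electronic device. The electronic device may be a terminal or a server. The terminal may be a tablet computer, a notebook computer, or a desktop computer. The server may be a single server or a server cluster. Specifically, the method may include the following steps.
S101. Obtain risk data of the target object in the target scene.
The target object may be any object, or any object in the target scene, or any object in the target scene that requires risk identification, or any object in the target scene that requires risk identification and has been entered or searched for. Objects include, but are not limited to, people. The target scene may be any scene, or any scene that requires risk supervision, or a scene, among multiple scenes, whose accident frequency is greater than or equal to a preset frequency. In one embodiment, objects can be further subdivided by scene. For example, in a car accident risk scene, the objects include but are not limited to drivers. In an anti-money-laundering scene, the objects include but are not limited to customers or staff of the relevant institution. In an audit scene, the objects include but are not limited to customers or staff of the relevant institution.
The risk data includes at least one tag used for risk prediction. For example, a tag may be a keyword. For example, risk data in a car accident risk scene may include tags such as the weather in the area, road conditions, and driver information (such as the driver's years of driving and/or vehicle information).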
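In one embodiment, the electronic device may obtain the risk data of the target object in the target scene from an information server corresponding to the target scene. For example, in a car accident risk scene, the information server includes but is not limited to at least one of the following: a traffic management server, a weather server, and a map server. The electronic device may obtain driver information from the traffic management server, weather information for the area where the target object is located from the weather server, and road condition information from the traffic management server or the map server.
In one embodiment, the electronic device may send a risk data acquisition request to the information server corresponding to the target scene, and receive the risk data of the target object in the target scene that the information server returns in response to the request.
In one embodiment, the electronic device obtains risk data of multiple objects in the target scene, and looks up the risk data of the target object from among it.
In one embodiment, the electronic device obtains a risk information set of the target object in the target scene, the set including at least one piece of information used for risk prediction; the electronic device may extract tags from each piece of information in the set to obtain the risk data of the target object in the target scene.
In one embodiment, the electronic device may obtain the risk information set of the target object in the target scene from the information server corresponding to the target scene.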
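S102. Determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene.
The risk prediction model is constructed from at least one group of combined tags in the target scene; each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives. In one embodiment, the logical connective may be "and" and/or "or". For example, the at least one group of combined tags includes a first group and a second group. The first group of combined tags is (tag 1 and tag 2 and tag 3), and the second group of combined tags is (tag 1 or tag 2 or tag 3).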
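To illustrate how tags joined by a logical connective could be evaluated, the following minimal sketch tests a group of combined tags against a set of observed tags. The `CombinedTags` class, its field names, and the tag strings are assumptions made for illustration; they do not appear in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CombinedTags:
    """A group of tags joined by one logical connective (hypothetical structure)."""
    tags: tuple
    connective: str  # "AND" stands for "and", "OR" stands for "or"

    def matches(self, observed: set) -> bool:
        hits = [tag in observed for tag in self.tags]
        # "AND" requires every tag to be present, "OR" requires at least one.
        return all(hits) if self.connective == "AND" else any(hits)


first = CombinedTags(("tag1", "tag2", "tag3"), "AND")
second = CombinedTags(("tag1", "tag2", "tag3"), "OR")
observed = {"tag1", "tag3"}
print(first.matches(observed), second.matches(observed))
```

With only two of the three tags observed, the "and" group does not match while the "or" group does.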
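In one embodiment, the risk prediction model may be a correspondence between each group of combined tags and its risk prediction result, or may be the trained preset model obtained by inputting the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training. The risk prediction result may be an accident rate. When the risk prediction result is an accident rate, the result indicates that the target object is at risk if the accident rate is greater than or equal to a preset value. The risk prediction result may also be a result indicating whether a risk exists or whether an accident will occur, presented in forms including but not limited to numbers, text, or letters. When the risk prediction result indicates that a risk exists or that an accident will occur, it indicates that the target object is at risk. In one embodiment, the risk prediction result may also include the category of the risk accident.
In one embodiment, the electronic device determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene may include: the electronic device looks up, in the correspondence between each group of combined tags and its risk prediction result, the risk prediction result corresponding to the risk data, and determines it as the risk prediction result of the target object. By querying the correspondence, the embodiments of this application can conveniently determine the risk prediction result of the target object.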
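The lookup described above can be sketched as follows. This is a sketch under stated assumptions, not the application's implementation: the model is represented as a dictionary from `(connective, tags)` keys to an accident rate, the threshold of 0.5 stands in for the "preset value", and taking the highest rate among matching groups is a choice made here because the text does not say how several matching groups are reconciled.

```python
def predict_risk(risk_tags, model, threshold=0.5):
    """Return (accident_rate, at_risk) for one target object.

    `model` maps (connective, tags) keys to an accident rate. Using the
    highest rate among matching groups is an assumption of this sketch.
    """
    rates = []
    for (connective, tags), rate in model.items():
        hits = [t in risk_tags for t in tags]
        if (connective == "AND" and all(hits)) or (connective == "OR" and any(hits)):
            rates.append(rate)
    accident_rate = max(rates, default=0.0)
    # At risk when the accident rate is greater than or equal to the preset value.
    return accident_rate, accident_rate >= threshold


# Hypothetical correspondence between combined tags and accident rates.
model = {
    ("AND", ("heavy_rain", "congested_road")): 0.7,
    ("OR", ("heavy_rain", "novice_driver")): 0.4,
}
rate, at_risk = predict_risk({"heavy_rain", "congested_road"}, model)
```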
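In one embodiment, the electronic device determining the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene may include: the electronic device inputs the risk data into the trained preset model for risk prediction, and the trained preset model outputs the risk prediction result of the target object. By obtaining the result through a model, the embodiments of this application can determine the risk prediction result of the target object quickly and accurately.
S103. When the risk prediction result indicates that the target object is at risk, add the information of the target object to the mark list.
In the embodiments of this application, when the risk prediction result indicates that the target object is at risk, the electronic device may add the information of the target object to the mark list. Doing so designates the target object as a key supervision object, allows objects at risk in the target scene to be managed in a unified way, and facilitates subsequent follow-up on the target object to keep track of its behavior.
The information of the target object may include an identifier of the target object, such as its name, image (for example, a profile picture), ID card number, contact details, or other information that uniquely identifies it. In one embodiment, the information may also include other information about the target object, such as its place of work or area of residence, which is not listed exhaustively here. The mark list can be used to record information of objects at risk in the target scene. For example, the mark list can record information of users at risk of a car accident in a car accident risk scene.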
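S104. Collect behavior data of the target object within a preset time range, and determine the category to which the behavior data belongs according to a preset behavior evaluation rule.
The preset time range may be any time range and can be set according to actual needs. For example, it may be the past three months counted from the current time, and/or the next three months counted from the current time. The behavior data may differ by scene. For example, in a car accident risk scene, the behavior data may include driving data. The driving data includes but is not limited to photographed or recorded driving records of the target object, such as driving violation data. The driving violation data includes but is not limited to at least one of the following: records of running a red light, rear-end collisions, speeding, and running a pedestrian crossing. In an anti-money-laundering scene, the behavior data may include transaction data. Optionally, the category may be a level, such as level 1, level 2, or level 3, where the severity indicated by level 1 is lower than that of level 2, and the severity indicated by level 2 is lower than that of level 3. As another example, the category may be unethical, illegal, or criminal.
In one embodiment, the electronic device may collect the behavior data of the target object within the preset time range from an information server. For example, in a car accident risk scene, the electronic device may collect the driving data of the target object from the traffic management server.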
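In one embodiment, the electronic device determining the category to which the behavior data belongs according to the preset behavior determination rule may include: the electronic device inputs the behavior data into a preset classification model, and the classification model classifies the behavior data to obtain the category to which it belongs. The classification model may be the trained specified model obtained by training a specified model with a collected behavior data training set and the category of each item of behavior data in that set. In this way, the electronic device can determine the category of the behavior data accurately and quickly through the classification model.
In one embodiment, the electronic device determining the category to which the behavior data belongs according to the preset behavior determination rule may also include: the electronic device performs named entity recognition on the behavior data to extract the entities in it, performs semantic analysis on the behavior data to obtain the relationships between the entities, matches the entities and their relationships against behavior determination data under different categories, and determines the category of the behavior data according to the matching result. The relationships may include behavior features. The behavior determination data may include a set of legal provisions or a set of legal rules. In one embodiment, the behavior determination data may also include a set of cases, or a set of cases corresponding to the legal provisions, or a set of cases corresponding to the legal rules.
For example, if the behavior data is "Driver A ran a red light", named entity recognition is performed on it to extract the entities "Driver A" and "red light", and semantic analysis is performed to obtain the relationship "ran" between them. The electronic device matches "Driver A", "ran", and "red light" against the behavior determination data (such as legal rules) under different categories, and determines from the matching result that the behavior data belongs to the category "illegal".
In one embodiment, the matching result may be a matched legal provision or legal rule, and the electronic device may determine the category corresponding to that provision or rule as the category of the behavior data. In one embodiment, the matched legal provision or legal rule may be the one determined to have the highest degree of match with the behavior data. Alternatively, the matching result may be a matched case, and the electronic device may determine the category corresponding to that case as the category of the behavior data. In one embodiment, the matched case may be the one determined to have the highest degree of match with the behavior data. In one embodiment, the electronic device may also determine the category corresponding to the legal provision or legal rule associated with the matched case as the category of the behavior data.
In one embodiment, the entities include the subject and object corresponding to each item in the behavior data, and the relationships between entities include the relationship between the subject and object of each item. Identifying the subject and object of each item effectively avoids recognizing irrelevant entities, which improves the efficiency of determining the category of the behavior data.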
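The matching step can be sketched as a lookup of extracted (subject, relation, object) triples against rule data. This is only an illustration: the triples are assumed to come from upstream named entity recognition and semantic analysis, which the sketch does not implement, and the `RULES` table, the "compliant" category, and the exact-match strategy are hypothetical (the text speaks of the highest degree of match without defining it).

```python
# Hypothetical behavior determination data: (relation, object) -> category.
RULES = {
    ("ran", "red light"): "illegal",
    ("exceeded", "speed limit"): "illegal",
    ("yielded to", "pedestrian"): "compliant",
}


def categorize(triples, rules=RULES, default="unclassified"):
    """Map (subject, relation, object) triples to categories.

    Exact matching on (relation, object) is a simplification of the
    "degree of match" mentioned in the text.
    """
    return [
        (subject, rules.get((relation, obj), default))
        for subject, relation, obj in triples
    ]


result = categorize([("driver A", "ran", "red light")])
```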
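S105. Generate a file that includes the information of the target object and the category to which the behavior data belongs.
In the embodiments of this application, the electronic device may generate a file that includes the information of the target object and the category to which the behavior data belongs, so that they can be archived and queried. The file format includes but is not limited to any of the following: doc, docx, pdf, excel. In one embodiment, the file may be an analysis report.
In one embodiment, the electronic device may also generate a file that includes the information of the target object, the risk prediction result of the target object, and the category to which the behavior data belongs.
In one embodiment, the electronic device may also generate a file that includes the information of the target object, the risk prediction result of the target object, the category to which the behavior data belongs, and other auxiliary determination data. The other auxiliary determination data may refer to data of dimensions other than the behavior data. For example, in a car accident risk scene, it may include captured videos of heavy drinking or of entering and leaving bars and similar venues.
In one embodiment, the other auxiliary determination data can be used to help analyze why the behavior data belongs to the category; that is, the electronic device can determine from that data the reason the behavior data belongs to the category. Alternatively, the other auxiliary determination data can be used to analyze which subcategory of the category the behavior data belongs to; that is, the electronic device can determine from that data the target subcategory to which the behavior data belongs.
In one embodiment, when the electronic device is a terminal, it may display the file. When the electronic device is a server, it may send the file to a corresponding device for display.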
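In one embodiment, the electronic device may also output first alarm information to alert the relevant personnel when the risk prediction result indicates that the target object is at risk. The electronic device may also output second alarm information to alert the relevant personnel when it determines that the category to which the behavior data belongs is a specified category.
In one embodiment, the electronic device may also trigger the step of determining the category of the behavior data according to the preset behavior determination rule when the information of the target object is found in a preset event table. The preset event table records information of objects that have had an accident in the target scene. This makes it easier for the relevant personnel to characterize the accident. For example, in a car accident risk scene, the preset event table may be traffic accident data or case-filing data recorded by the traffic management server within a preset time period.
It can be seen that, in the embodiment shown in FIG. 1, the electronic device can obtain risk data of the target object in the target scene, and determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene; when the risk prediction result indicates that the target object is at risk, the electronic device can add the information of the target object to the mark list, collect the behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior determination rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs. This makes the risk identification process more targeted and improves the accuracy of risk identification.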
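The flow of steps S102 to S105 can be sketched as a single function. Everything named here is hypothetical: `predict`, `collect_behavior`, and `classify` are caller-supplied callables standing in for the model and servers described above, JSON stands in for the doc, docx, pdf, and excel formats named in the text, and running S104 and S105 only for at-risk objects is one reading of the summary rather than something the text states outright.

```python
import json


def identify_risk(target, risk_tags, predict, collect_behavior, classify, mark_list):
    """Run steps S102 to S105 for one target object.

    Returns the generated file content as a JSON string, or None when the
    prediction does not indicate a risk.
    """
    at_risk = predict(risk_tags)             # S102: risk prediction result
    if not at_risk:
        return None
    mark_list.append(target)                 # S103: add to the mark list
    behavior = collect_behavior(target)      # S104: collect behavior data
    category = classify(behavior)            # S104: determine its category
    # S105: file with the target's information and the behavior category.
    return json.dumps({"target": target, "category": category}, ensure_ascii=False)


mark_list = []
report = identify_risk(
    target={"name": "driver A"},
    risk_tags={"heavy_rain"},
    predict=lambda tags: "heavy_rain" in tags,
    collect_behavior=lambda target: ["ran a red light"],
    classify=lambda behavior: "illegal" if behavior else "none",
    mark_list=mark_list,
)
```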
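Refer to FIG. 2, which is a schematic flowchart of another risk identification method based on data analysis provided by an embodiment of this application. The method can be applied to an electronic device, which may be a terminal or a server. Specifically, the method may include the following steps.
S201. Obtain the scene identifier of the target scene, and determine the target factor list corresponding to the scene identifier of the target scene according to the preset correspondence between scene identifiers and factor lists.
The scene identifier may be a scene name. The target factor list is the factor list corresponding to the target scene and may include one or more factors. For example, the target factor list for a car accident risk scene may include at least one of the following: weather, road conditions, driver, and driver's years of driving. In one embodiment, a factor can be understood as a collective name for a class of tags. For example, weather is a collective name for tags that describe the weather, such as sunny, cloudy, light rain, and rainstorm.
In one embodiment, the electronic device may determine the target factor list corresponding to the scene identifier of the target scene from a factor library according to the preset correspondence between scene identifiers and factor lists. The factor library stores the factor list corresponding to each scene.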
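S202. Match the tag list corresponding to the target factor list from the database.
The database stores at least one tag list and the correspondence between each tag list and a factor list. For example, the target factor list includes a first factor and a second factor. If the first factor is road conditions, the tags in the tag list corresponding to it may be road conditions such as congested or free-flowing; if the second factor is the driver's years of driving, the corresponding tags may be values such as 2 years of driving or 3 years of driving.
In the embodiments of this application, the electronic device may determine the tag list corresponding to the target factor list from the at least one tag list in the database, according to the correspondence between each tag list and a factor list stored in the database.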
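Steps S201 and S202 amount to two chained lookups, which the following sketch models with in-memory dictionaries. The scene identifier, factor names, tag strings, and the `FACTOR_LIBRARY` and `TAG_DATABASE` structures are all hypothetical stand-ins for the factor library and database described above.

```python
# Hypothetical factor library: scene identifier -> factor list.
FACTOR_LIBRARY = {
    "car_accident_risk": ("weather", "road_condition", "driving_years"),
}

# Hypothetical database: factor list -> tag list, grouped by factor.
TAG_DATABASE = {
    ("weather", "road_condition", "driving_years"): {
        "weather": ["sunny", "cloudy", "light_rain", "rainstorm"],
        "road_condition": ["congested", "free_flowing"],
        "driving_years": ["2_years", "3_years"],
    },
}


def tag_list_for_scene(scene_id):
    """S201 and S202: resolve a scene identifier to its tag list."""
    factors = FACTOR_LIBRARY[scene_id]   # S201: scene identifier -> factor list
    return TAG_DATABASE[factors]         # S202: factor list -> tag list


tags = tag_list_for_scene("car_accident_risk")
```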
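S203. Combine the tags in the tag list to obtain at least one group of combined tags.
In the embodiments of this application, the electronic device may combine the tags in the tag list to obtain at least one group of combined tags.
In one embodiment, the electronic device may randomly sample the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; the electronic device may add logical connectives between the tags in each group to obtain at least one group of combined tags corresponding to each group of tags. For example, the multiple groups include a first group of tags (tag 1, tag 2, tag 3); adding connectives between its tags yields at least the following two groups of combined tags: a first group of combined tags (tag 1 and tag 2 and tag 3) and a second group of combined tags (tag 1 or tag 2 or tag 3). Optionally, the random sampling may be sampling with replacement.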
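A minimal sketch of the sampling and connective steps follows. It reads "sampling with replacement" as meaning that the same group may be drawn more than once while the tags inside one group stay distinct; that reading, the function names, and the string representation of combined tags are assumptions.

```python
import random


def sample_groups(tag_list, group_size, n_groups, rng):
    """Draw `n_groups` groups of distinct tags; identical groups may recur."""
    return [tuple(sorted(rng.sample(tag_list, group_size))) for _ in range(n_groups)]


def add_connectives(group, connectives=("and", "or")):
    """Join the tags of one group with each preset connective in turn."""
    return [f" {c} ".join(group) for c in connectives]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
groups = sample_groups(["tag1", "tag2", "tag3", "tag4"], 3, 5, rng)
combined = [expr for g in groups for expr in add_connectives(g)]
```

Each sampled group therefore yields two combined-tag expressions, one per connective.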
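Optionally, because sampling with replacement may produce at least two identical groups among the multiple groups of tags, the groups can be de-duplicated. In one embodiment, the electronic device combining the tags in the tag list to obtain at least one group of combined tags may include: the electronic device randomly samples the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; the electronic device de-duplicates the multiple groups to obtain at least one group of tags, the groups in which differ from one another; the electronic device adds a preset logical connective between the tags in each of the at least one group to obtain at least one group of combined tags corresponding to each group of tags. The tags within each group are not repeated. This effectively reduces the repetition rate of the groups of tags, and in turn that of the combined tags corresponding to each group.
In one embodiment, to increase the sampling rate of certain tags, the electronic device randomly sampling the tags in the tag list may include: the electronic device obtains the weight set for each tag in the tag list; the electronic device applies a weighted random sampling algorithm to sample the tags according to those weights. In one embodiment, the higher the weight, the higher the probability of being sampled.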
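One way to realize weighted random sampling is sketched below with the standard library's `random.choices`. The application does not name a specific algorithm, so drawing one tag at a time and removing it to keep the group's tags distinct is an assumption, as are the tag names and weights.

```python
import random


def weighted_sample_group(weights, group_size, rng):
    """Draw `group_size` distinct tags; a higher weight means a higher
    chance of being drawn. `weights` maps each tag to a positive number."""
    remaining = dict(weights)
    group = []
    for _ in range(group_size):
        tags = list(remaining)
        pick = rng.choices(tags, weights=[remaining[t] for t in tags], k=1)[0]
        group.append(pick)
        del remaining[pick]  # keep the tags within one group distinct
    return group


# Hypothetical weights: "rainstorm" is five times as likely as "sunny".
weights = {"rainstorm": 5.0, "congested": 3.0, "sunny": 1.0, "free_flowing": 1.0}
group = weighted_sample_group(weights, 2, random.Random(1))
```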
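In one embodiment, during de-duplication, each time a group of tags is sampled, the electronic device may query whether the saved groups of tags already include that group; if they do not, it saves the group; if they do, it discards (for example, deletes) the group.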
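The save-or-discard check can be sketched with a set of already-seen groups. Treating a group as unordered (so that the same tags in a different order count as a duplicate) is an assumption of this sketch, as are the function and variable names.

```python
import random


def sample_unique_groups(tag_list, group_size, n_draws, rng):
    """Sample groups one at a time, saving a group only if it is new."""
    saved = []
    seen = set()
    for _ in range(n_draws):
        group = frozenset(rng.sample(tag_list, group_size))
        if group in seen:
            continue  # already saved: discard the duplicate
        seen.add(group)
        saved.append(sorted(group))
    return saved


unique_groups = sample_unique_groups(["tag1", "tag2", "tag3"], 2, 50, random.Random(0))
```

With three tags and groups of two, at most three distinct groups can be saved no matter how many draws are made.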
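In one embodiment, the electronic device combining the tags in the tag list to obtain at least one group of combined tags may include: the electronic device uses a recursive algorithm to arrange and combine the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; the electronic device adds a preset logical connective between the tags in each of the multiple groups to obtain at least one group of combined tags corresponding to each group of tags. Using a recursive algorithm to obtain the groups of tags can improve the efficiency of obtaining them.
For example, the tag list includes n tags; the electronic device can select m tags from them and arrange and combine them to obtain multiple groups of tags. Obtaining the groups with the above recursive algorithm may include: selecting the first tag in the tag list as the first element of a group, and selecting (m-1) tags from the tags located after the first tag as the remaining (m-1) elements of that group; and so on, selecting each of the 2nd to (n-m+1)th tags in the tag list as the first element of a group, and selecting (m-1) tags from the tags located after it as the remaining (m-1) elements of that group.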
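The recursion described above can be written directly: fix each of the 1st to (n-m+1)th tags as the first element, then recurse on the tags after it for the remaining (m-1) elements. The function name `choose_groups` and the sample tags are illustrative.

```python
def choose_groups(tags, m):
    """Recursively list every m-tag group, preserving tag-list order."""
    if m == 0:
        return [[]]
    groups = []
    # The first element may be any of the 1st to (n-m+1)th tags.
    for i in range(len(tags) - m + 1):
        # The remaining (m-1) elements come from the tags after position i.
        for rest in choose_groups(tags[i + 1:], m - 1):
            groups.append([tags[i]] + rest)
    return groups


pairs = choose_groups(["tag1", "tag2", "tag3", "tag4"], 2)
```

For n = 4 and m = 2 this enumerates the six possible pairs, the same groups that `itertools.combinations` would produce.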
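In one embodiment, the tags in the aforementioned tag list may be grouped by their corresponding factors, giving at least one tag for each factor. During random sampling, the electronic device each time draws one tag from the at least one tag corresponding to each factor to construct a group of tags. Therefore, the tags in each group correspond to different factors.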
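Drawing one tag per factor can be sketched as follows; the factor names and tags are hypothetical, and uniform choice within each factor is an assumption.

```python
import random


def sample_one_per_factor(tags_by_factor, rng):
    """Build one group by drawing exactly one tag from each factor."""
    return {factor: rng.choice(tags) for factor, tags in tags_by_factor.items()}


tags_by_factor = {
    "weather": ["sunny", "cloudy", "light_rain", "rainstorm"],
    "road_condition": ["congested", "free_flowing"],
}
one_group = sample_one_per_factor(tags_by_factor, random.Random(0))
```

Because each factor contributes exactly one tag, no group can contain two tags of the same factor, such as "sunny" together with "rainstorm".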
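In one embodiment, the electronic device may also de-duplicate the at least one group of combined tags corresponding to each group of tags. The de-duplication may consist of deleting, from the at least one group of combined tags, any group whose combined tags are identical to those of another group.
S204. Use the at least one group of combined tags to construct the risk prediction model of the target scene.
In the embodiments of this application, the electronic device may use the at least one group of combined tags to construct the risk prediction model of the target scene.
In one embodiment, the electronic device using the at least one group of combined tags to construct the risk prediction model of the target scene may include: the electronic device establishes a correspondence between each group of combined tags and its risk prediction result, and determines that correspondence as the risk prediction model of the target scene. Using the correspondence as the model allows the risk prediction model to be determined quickly and effectively.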
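As one possible way to populate such a correspondence, the sketch below assigns each "and"-joined group the accident frequency observed in historical records. The text does not say how the risk prediction result for each group is obtained, so deriving it from history, the record format, and all names here are assumptions.

```python
def build_lookup_model(combined_groups, history):
    """Map each "and"-joined tag group to an observed accident rate.

    `history` is a list of (observed_tags, had_accident) records. Groups
    with no matching record are left out of the model.
    """
    model = {}
    for group in combined_groups:
        outcomes = [accident for tags, accident in history if set(group) <= tags]
        if outcomes:
            model[group] = sum(outcomes) / len(outcomes)
    return model


# Hypothetical historical records.
history = [
    ({"rain", "congested"}, True),
    ({"rain", "congested"}, False),
    ({"sunny", "congested"}, False),
]
lookup = build_lookup_model([("rain", "congested"), ("sunny", "congested")], history)
```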
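In one embodiment, the electronic device using the at least one group of combined tags to construct the risk prediction model of the target scene may include: the electronic device inputs the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training, obtains the trained preset model, and determines the trained preset model as the risk prediction model of the target scene. Determining the risk prediction model through modeling can improve its scalability and prediction accuracy.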
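The application leaves the "preset model" unspecified. Purely as an illustration, the sketch below substitutes a small logistic regression trained by batch gradient descent on multi-hot encoded tag groups; the model choice, the encoding, the hyperparameters, and the toy data are all assumptions.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, vocab, epochs=500, lr=0.5):
    """Fit a logistic regression on multi-hot encoded tag groups.

    `samples` is a list of (tags, label) pairs with label 0 or 1.
    """
    encoded = [([1.0 if v in tags else 0.0 for v in vocab], y) for tags, y in samples]
    weights, bias = [0.0] * len(vocab), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(vocab), 0.0
        for x, y in encoded:
            err = _sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
        # Batch gradient step on the log-loss.
        bias -= lr * grad_b
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return weights, bias


def predict(tags, vocab, weights, bias):
    """Return the predicted probability of risk for a set of tags."""
    return _sigmoid(bias + sum(w for v, w in zip(vocab, weights) if v in tags))


vocab = ["rain", "sunny", "congested", "clear_road"]
samples = [
    ({"rain", "congested"}, 1),
    ({"rain", "clear_road"}, 1),
    ({"sunny", "congested"}, 0),
    ({"sunny", "clear_road"}, 0),
]
weights, bias = train_logistic(samples, vocab)
```

In this toy data the label depends only on the weather tag, so the trained model scores any group containing "rain" as high risk and any group containing "sunny" as low risk.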
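S205. Obtain risk data of the target object in the target scene.
S206. Determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene.
S207. When the risk prediction result indicates that the target object is at risk, add the information of the target object to the mark list.
S208. Collect behavior data of the target object within a preset time range, and determine the category to which the behavior data belongs according to a preset behavior evaluation rule.
S209. Generate a file that includes the information of the target object and the category to which the behavior data belongs.
For steps S205-S209, refer to steps S101-S105 in the embodiment of FIG. 1; details are not repeated here.
It can be seen that, in the embodiment shown in FIG. 2, the electronic device can obtain the scene identifier of the target scene, and determine the target factor list corresponding to that scene identifier according to the preset correspondence between scene identifiers and factor lists; the electronic device can match the tag list corresponding to the target factor list from the database, and combine the tags in the tag list to obtain at least one group of combined tags. The at least one group of combined tags is then used to construct the risk prediction model of the target scene. This approach constructs the risk prediction model of the target scene quickly and effectively, automates the construction process, and improves construction efficiency.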
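Refer to FIG. 3, which is a schematic structural diagram of a risk identification device based on data analysis provided by an embodiment of this application. The device can be applied to an electronic device. Specifically, the device may include the following units.
The obtaining unit 301 is configured to obtain risk data of the target object in the target scene; the risk data includes at least one tag used for risk prediction.
The determining unit 302 is configured to determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene; wherein the risk prediction model is constructed from at least one group of combined tags in the target scene, each group of combined tags includes multiple tags, and the tags in each group are connected by logical connectives.
The adding unit 303 is configured to add the information of the target object to the mark list when the risk prediction result indicates that the target object is at risk.
The processing unit 304 is configured to collect behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior evaluation rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs.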
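In an optional implementation, the determining unit 302 is further configured to obtain the scene identifier of the target scene through the obtaining unit 301, and determine the target factor list corresponding to the scene identifier of the target scene according to the preset correspondence between scene identifiers and factor lists; the target factor list includes one or more factors.
In an optional implementation, the processing unit 304 is further configured to match the tag list corresponding to the target factor list from the database, the database storing at least one tag list and the correspondence between each tag list and a factor list; combine the tags in the tag list to obtain at least one group of combined tags; and use the at least one group of combined tags to construct the risk prediction model of the target scene.
In an optional implementation, when combining the tags in the tag list to obtain at least one group of combined tags, the processing unit 304 specifically: randomly samples the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; de-duplicates the multiple groups to obtain at least one group of tags, the groups in which differ from one another; and adds a preset logical connective between the tags in each of the at least one group to obtain at least one group of combined tags corresponding to each group of tags.
In an optional implementation, when randomly sampling the tags in the tag list, the processing unit 304 specifically: obtains the weight set for each tag in the tag list; and applies a weighted random sampling algorithm to sample the tags according to those weights.
In an optional implementation, when combining the tags in the tag list to obtain at least one group of combined tags, the processing unit 304 specifically: uses a recursive algorithm to arrange and combine the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; and adds a preset logical connective between the tags in each of the multiple groups to obtain at least one group of combined tags corresponding to each group of tags.
In an optional implementation, when using the at least one group of combined tags to construct the risk prediction model of the target scene, the processing unit 304 specifically: establishes a correspondence between each group of combined tags and its risk prediction result, and determines that correspondence as the risk prediction model of the target scene; or inputs the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training, obtains the trained preset model, and determines the trained preset model as the risk prediction model of the target scene.
In an optional implementation, when determining the category to which the behavior data belongs according to a preset behavior determination rule, the processing unit 304 specifically: inputs the behavior data into a preset classification model, and classifies the behavior data with the classification model to obtain the category to which it belongs; or performs named entity recognition on the behavior data to extract the entities in it, performs semantic analysis on the behavior data to obtain the relationships between the entities, matches the entities and their relationships against behavior determination data under different categories, and determines the category to which the behavior data belongs according to the matching result.
It can be seen that, in the embodiment shown in FIG. 3, the electronic device can obtain risk data of the target object in the target scene, and determine the risk prediction result of the target object according to the risk data and the risk prediction model of the target scene; when the risk prediction result indicates that the target object is at risk, the electronic device can add the information of the target object to the mark list, collect the behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior determination rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs. This can make the risk identification process more targeted and improve the accuracy of risk identification.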
请参阅图4,为本申请实施例提供的一种电子设备的结构示意图。其中,本实施例中所描述的电子设备可以包括处理器1000和存储器2000。处理器1000和存储器2000之间可以通过如图4所示的总线或其它方式连接。在一个实施例中,该电子设备还可以包括一个或多个输入设备3000、一个或多个输出设备4000。处理器1000、存储器2000、一个或多个输入设备3000和一个或多个输出设备4000之间可以通过总线或其它方式连接。在一个实施例中,输入设备3000包括但不限于触摸屏、录音器、传感器等设备。输出设备4000包括但不限于显示屏、扬声器等设备。该触摸屏和显示屏还可以替换为触摸显示屏。在一个实施例中,输入设备3000和输出设备4000可以包括标准的有线或无线通信接口。
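The processor 1000 may be a central processing unit (CPU); it may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.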
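The memory 2000 may be a high-speed RAM or a non-volatile memory, for example a disk memory. The memory 2000 is used to store a set of program code, and the processor 1000, the input device 3000 and the output device 4000 can call the program code stored in the memory 2000. Specifically, the processor 1000 is configured to: acquire risk data of a target object in a target scenario, the risk data including at least one tag used for risk prediction; determine a risk prediction result for the target object according to the risk data and the risk prediction model of the target scenario, where the risk prediction model is constructed from at least one group of combined tags in the target scenario, each group of combined tags includes multiple tags, and the tags within each group are connected by logical connectives; add the information of the target object to a marked list when the risk prediction result indicates that the target object is at risk; collect behavior data of the target object within a preset time range, and determine the category to which the behavior data belongs according to a preset behavior determination rule; and generate a file that includes the information of the target object and the category to which the behavior data belongs.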
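Optionally, the processor 1000 is further configured to: obtain a scenario identifier of the target scenario, and determine, according to a preset correspondence between scenario identifiers and factor lists, the target factor list corresponding to that scenario identifier, the target factor list including one or more factors; match, from a database, the tag list corresponding to the target factor list, where the database stores at least one tag list and the correspondence between each tag list and a factor list; combine the tags in the tag list to obtain at least one group of combined tags; and construct the risk prediction model of the target scenario using the at least one group of combined tags.
Optionally, the processor 1000 combines the tags in the tag list to obtain at least one group of combined tags specifically by: randomly sampling the tags in the tag list to obtain multiple groups of tags, each of which includes multiple tags; deduplicating the multiple groups of tags to obtain at least one group of tags, so that the remaining groups differ from one another; and adding preset logical connectives between the tags of each remaining group to obtain at least one group of combined tags corresponding to that group.
Optionally, the processor 1000 randomly samples the tags in the tag list specifically by: obtaining the weight set for each tag in the tag list; and randomly sampling the tags with a weighted random sampling algorithm according to those weights.
Optionally, the processor 1000 combines the tags in the tag list to obtain at least one group of combined tags specifically by: using a recursive algorithm to permute and combine the tags in the tag list to obtain multiple groups of tags, each of which includes multiple tags; and adding preset logical connectives between the tags of each group to obtain at least one group of combined tags corresponding to each group.
Optionally, the processor 1000 constructs the risk prediction model of the target scenario using the at least one group of combined tags specifically by: establishing a correspondence between each group of combined tags and its corresponding risk prediction result, and taking that correspondence as the risk prediction model of the target scenario; or inputting the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training, and taking the trained preset model as the risk prediction model of the target scenario.
Optionally, the processor 1000 determines the category to which the behavior data belongs according to the preset behavior determination rule specifically by: inputting the behavior data into a preset classification model, which classifies the behavior data and yields its category; or performing named entity recognition on the behavior data to extract the entities it contains, performing semantic analysis on the behavior data to obtain the relationships among those entities, matching the entities and their relationships against the behavior determination data of the different categories, and determining the category to which the behavior data belongs according to the matching result.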
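In a specific implementation, the processor 1000, the input device 3000 and the output device 4000 described in the embodiments of this application can carry out the implementations described in the embodiments of FIG. 1 and FIG. 2, as well as the implementations described in this embodiment, which are not repeated here.
The functional units in the embodiments of this application may be integrated into one processing unit, may each exist physically on their own, or two or more of them may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.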
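An embodiment of this application further provides a computer-readable storage medium that stores a computer program, the computer program being executed by a processor to implement the following steps: acquiring risk data of a target object in a target scenario, the risk data including at least one tag used for risk prediction; determining a risk prediction result for the target object according to the risk data and the risk prediction model of the target scenario, where the risk prediction model is constructed from at least one group of combined tags in the target scenario, each group of combined tags includes multiple tags, and the tags within each group are connected by logical connectives; adding the information of the target object to a marked list when the risk prediction result indicates that the target object is at risk; collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior determination rule; and generating a file that includes the information of the target object and the category to which the behavior data belongs.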
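A person of ordinary skill in the art will understand that all or part of the processes of the above method embodiments can be completed by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The computer-readable storage medium may be non-volatile or volatile, and may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is only a preferred embodiment of this application and certainly cannot be used to limit the scope of its rights. A person of ordinary skill in the art will understand that implementations of all or part of the processes of the above embodiments, and equivalent changes made in accordance with the claims of this application, still fall within the scope covered by this application.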

Claims (20)

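  1. A risk identification method based on data analysis, comprising:
    acquiring risk data of a target object in a target scenario, the risk data including at least one tag used for risk prediction;
    determining a risk prediction result for the target object according to the risk data and a risk prediction model of the target scenario, wherein the risk prediction model is constructed from at least one group of combined tags in the target scenario, each group of combined tags in the at least one group includes multiple tags, and the tags within each group are connected by logical connectives;
    adding information of the target object to a marked list when the risk prediction result indicates that the target object is at risk;
    collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior determination rule; and
    generating a file that includes the information of the target object and the category to which the behavior data belongs.
  2. The method according to claim 1, further comprising:
    obtaining a scenario identifier of the target scenario, and determining, according to a preset correspondence between scenario identifiers and factor lists, the target factor list corresponding to the scenario identifier of the target scenario, the target factor list including one or more factors;
    matching, from a database, the tag list corresponding to the target factor list, wherein the database stores at least one tag list and the correspondence between each of the at least one tag list and a factor list;
    combining the tags in the tag list to obtain at least one group of combined tags; and
    constructing the risk prediction model of the target scenario using the at least one group of combined tags.
  3. The method according to claim 2, wherein combining the tags in the tag list to obtain at least one group of combined tags comprises:
    randomly sampling the tags in the tag list to obtain multiple groups of tags, each of the multiple groups including multiple tags;
    deduplicating the multiple groups of tags to obtain at least one group of tags, the groups in the at least one group of tags differing from one another; and
    adding preset logical connectives between the tags of each group in the at least one group of tags to obtain at least one group of combined tags corresponding to that group.
  4. The method according to claim 3, wherein randomly sampling the tags in the tag list comprises:
    obtaining the weight set for each tag in the tag list; and
    randomly sampling the tags with a weighted random sampling algorithm according to the weight set for each tag in the tag list.
  5. The method according to claim 2, wherein combining the tags in the tag list to obtain at least one group of combined tags comprises:
    using a recursive algorithm to permute and combine the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; and
    adding preset logical connectives between the tags of each of the multiple groups to obtain at least one group of combined tags corresponding to each group.
  6. The method according to claim 2, wherein constructing the risk prediction model of the target scenario using the at least one group of combined tags comprises:
    establishing a correspondence between each group of combined tags in the at least one group and its corresponding risk prediction result, and taking that correspondence as the risk prediction model of the target scenario; or
    inputting the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training to obtain a trained preset model, and taking the trained preset model as the risk prediction model of the target scenario.
  7. The method according to claim 1, wherein determining the category to which the behavior data belongs according to the preset behavior determination rule comprises:
    inputting the behavior data into a preset classification model, and classifying the behavior data by the classification model to obtain the category to which the behavior data belongs; or
    performing named entity recognition on the behavior data to extract the entities in the behavior data, performing semantic analysis on the behavior data to obtain the relationships among the entities, matching the entities and the relationships among them against behavior determination data of different categories, and determining the category to which the behavior data belongs according to the matching result.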
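  8. A risk identification apparatus based on data analysis, comprising:
    an acquiring unit, configured to acquire risk data of a target object in a target scenario, the risk data including at least one tag used for risk prediction;
    a determining unit, configured to determine a risk prediction result for the target object according to the risk data and a risk prediction model of the target scenario, wherein the risk prediction model is constructed from at least one group of combined tags in the target scenario, each group of combined tags in the at least one group includes multiple tags, and the tags within each group are connected by logical connectives;
    an adding unit, configured to add information of the target object to a marked list when the risk prediction result indicates that the target object is at risk; and
    a processing unit, configured to collect behavior data of the target object within a preset time range, determine the category to which the behavior data belongs according to a preset behavior determination rule, and generate a file that includes the information of the target object and the category to which the behavior data belongs.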
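  9. An electronic device, comprising a processor and a memory connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to perform the following steps:
    acquiring risk data of a target object in a target scenario, the risk data including at least one tag used for risk prediction;
    determining a risk prediction result for the target object according to the risk data and a risk prediction model of the target scenario, wherein the risk prediction model is constructed from at least one group of combined tags in the target scenario, each group of combined tags in the at least one group includes multiple tags, and the tags within each group are connected by logical connectives;
    adding information of the target object to a marked list when the risk prediction result indicates that the target object is at risk;
    collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior determination rule; and
    generating a file that includes the information of the target object and the category to which the behavior data belongs.
  10. The electronic device according to claim 9, wherein the processor is further configured to: obtain a scenario identifier of the target scenario, and determine, according to a preset correspondence between scenario identifiers and factor lists, the target factor list corresponding to the scenario identifier of the target scenario, the target factor list including one or more factors; match, from a database, the tag list corresponding to the target factor list, wherein the database stores at least one tag list and the correspondence between each of the at least one tag list and a factor list; combine the tags in the tag list to obtain at least one group of combined tags; and construct the risk prediction model of the target scenario using the at least one group of combined tags.
  11. The electronic device according to claim 10, wherein the processor combines the tags in the tag list to obtain at least one group of combined tags specifically by: randomly sampling the tags in the tag list to obtain multiple groups of tags, each of the multiple groups including multiple tags; deduplicating the multiple groups of tags to obtain at least one group of tags, the groups in the at least one group of tags differing from one another; and adding preset logical connectives between the tags of each group in the at least one group of tags to obtain at least one group of combined tags corresponding to that group.
  12. The electronic device according to claim 11, wherein the processor randomly samples the tags in the tag list specifically by: obtaining the weight set for each tag in the tag list; and randomly sampling the tags with a weighted random sampling algorithm according to the weight set for each tag in the tag list.
  13. The electronic device according to claim 10, wherein the processor combines the tags in the tag list to obtain at least one group of combined tags specifically by: using a recursive algorithm to permute and combine the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; and adding preset logical connectives between the tags of each of the multiple groups to obtain at least one group of combined tags corresponding to each group.
  14. The electronic device according to claim 10, wherein the processor constructs the risk prediction model of the target scenario using the at least one group of combined tags specifically by: establishing a correspondence between each group of combined tags in the at least one group and its corresponding risk prediction result, and taking that correspondence as the risk prediction model of the target scenario; or inputting the at least one group of combined tags, together with the risk prediction result corresponding to each group, into a preset model for training to obtain a trained preset model, and taking the trained preset model as the risk prediction model of the target scenario.
  15. The electronic device according to claim 9, wherein the processor determines the category to which the behavior data belongs according to the preset behavior determination rule specifically by: inputting the behavior data into a preset classification model, and classifying the behavior data by the classification model to obtain the category to which the behavior data belongs; or performing named entity recognition on the behavior data to extract the entities in the behavior data, performing semantic analysis on the behavior data to obtain the relationships among the entities, matching the entities and the relationships among them against behavior determination data of different categories, and determining the category to which the behavior data belongs according to the matching result.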
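  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the following steps:
    acquiring risk data of a target object in a target scenario, the risk data including at least one tag used for risk prediction;
    determining a risk prediction result for the target object according to the risk data and a risk prediction model of the target scenario, wherein the risk prediction model is constructed from at least one group of combined tags in the target scenario, each group of combined tags in the at least one group includes multiple tags, and the tags within each group are connected by logical connectives;
    adding information of the target object to a marked list when the risk prediction result indicates that the target object is at risk;
    collecting behavior data of the target object within a preset time range, and determining the category to which the behavior data belongs according to a preset behavior determination rule; and
    generating a file that includes the information of the target object and the category to which the behavior data belongs.
  17. The computer-readable storage medium according to claim 16, wherein the computer program, when executed by the processor, further implements the following steps:
    obtaining a scenario identifier of the target scenario, and determining, according to a preset correspondence between scenario identifiers and factor lists, the target factor list corresponding to the scenario identifier of the target scenario, the target factor list including one or more factors;
    matching, from a database, the tag list corresponding to the target factor list, wherein the database stores at least one tag list and the correspondence between each of the at least one tag list and a factor list;
    combining the tags in the tag list to obtain at least one group of combined tags; and
    constructing the risk prediction model of the target scenario using the at least one group of combined tags.
  18. The computer-readable storage medium according to claim 17, wherein, when combining the tags in the tag list to obtain at least one group of combined tags, the computer program is executed by the processor to implement the following steps:
    randomly sampling the tags in the tag list to obtain multiple groups of tags, each of the multiple groups including multiple tags;
    deduplicating the multiple groups of tags to obtain at least one group of tags, the groups in the at least one group of tags differing from one another; and
    adding preset logical connectives between the tags of each group in the at least one group of tags to obtain at least one group of combined tags corresponding to that group.
  19. The computer-readable storage medium according to claim 18, wherein, when randomly sampling the tags in the tag list, the computer program is executed by the processor to implement the following steps:
    obtaining the weight set for each tag in the tag list; and
    randomly sampling the tags with a weighted random sampling algorithm according to the weight set for each tag in the tag list.
  20. The computer-readable storage medium according to claim 17, wherein, when combining the tags in the tag list to obtain at least one group of combined tags, the computer program is executed by the processor to implement the following steps:
    using a recursive algorithm to permute and combine the tags in the tag list to obtain multiple groups of tags, each group including multiple tags; and
    adding preset logical connectives between the tags of each of the multiple groups to obtain at least one group of combined tags corresponding to each group.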
PCT/CN2020/099556 2019-07-10 2020-06-30 基于数据分析的风险识别方法及相关设备 WO2021004344A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910619081.6 2019-07-10
CN201910619081.6A CN110428091B (zh) 2019-07-10 2019-07-10 基于数据分析的风险识别方法及相关设备

Publications (1)

Publication Number Publication Date
WO2021004344A1 true WO2021004344A1 (zh) 2021-01-14

Family

ID=68409194

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/099556 WO2021004344A1 (zh) 2019-07-10 2020-06-30 基于数据分析的风险识别方法及相关设备

Country Status (2)

Country Link
CN (1) CN110428091B (zh)
WO (1) WO2021004344A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505713A (zh) * 2021-07-16 2021-10-15 上海塞嘉电子科技有限公司 一种基于机场安全管理平台的视频智能分析方法及系统
CN115146725A (zh) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 对象分类模式的确定方法、对象分类方法、装置和设备
CN115148028A (zh) * 2022-06-30 2022-10-04 北京小马智行科技有限公司 依据历史数据构建车辆路测场景的方法、装置及一种车辆
CN115274133A (zh) * 2022-07-15 2022-11-01 宝鸡市交通信息工程研究所 一种基于流调大数据的行踪识别方法

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110428091B (zh) * 2019-07-10 2022-12-27 平安科技(深圳)有限公司 基于数据分析的风险识别方法及相关设备
CN110826006B (zh) * 2019-11-22 2021-03-19 支付宝(杭州)信息技术有限公司 基于隐私数据保护的异常采集行为识别方法和装置
CN110942396A (zh) * 2019-11-28 2020-03-31 泰康保险集团股份有限公司 数据处理方法、装置及设备
CN111144658B (zh) * 2019-12-30 2023-06-16 医渡云(北京)技术有限公司 医疗风险预测方法、装置、系统、存储介质与电子设备
CN111339894A (zh) * 2020-02-20 2020-06-26 支付宝(杭州)信息技术有限公司 一种数据处理、风险识别方法、装置、设备及介质
CN111770095B (zh) * 2020-06-29 2023-04-18 百度在线网络技术(北京)有限公司 探测方法、装置、设备以及存储介质
CN112116401A (zh) * 2020-09-28 2020-12-22 中国建设银行股份有限公司 压力测试方法、装置、设备和存储介质
CN113312924A (zh) * 2021-06-23 2021-08-27 北京鼎泰智源科技有限公司 一种基于nlp高精解析标签的风险规则分类方法及装置
CN114996463B (zh) * 2022-07-18 2022-11-01 武汉大学人民医院(湖北省人民医院) 一种病例的智能分类方法和装置
CN116070916B (zh) * 2023-03-06 2023-06-16 支付宝(杭州)信息技术有限公司 数据处理方法、装置及设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647743A (zh) * 2018-06-25 2018-10-12 江苏智通交通科技有限公司 驾驶人安全画像系统
CN109165840A (zh) * 2018-08-20 2019-01-08 平安科技(深圳)有限公司 风险预测处理方法、装置、计算机设备和介质
CN109272396A (zh) * 2018-08-20 2019-01-25 平安科技(深圳)有限公司 客户风险预警方法、装置、计算机设备和介质
CN109635335A (zh) * 2018-11-12 2019-04-16 平安科技(深圳)有限公司 驾驶风险预测方法、装置、计算机设备及存储介质
CN110428091A (zh) * 2019-07-10 2019-11-08 平安科技(深圳)有限公司 基于数据分析的风险识别方法及相关设备

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622552A (zh) * 2012-04-12 2012-08-01 焦点科技股份有限公司 一种基于数据挖掘的b2b平台欺诈访问的检测方法和系统
CN106845747B (zh) * 2016-06-29 2020-12-04 国网浙江省电力公司宁波供电公司 基于电力客户标签的电费风险防控应用方法
CN107844548A (zh) * 2017-10-30 2018-03-27 北京锐安科技有限公司 一种数据标签方法和装置
CN108573339A (zh) * 2018-03-22 2018-09-25 昆明理工大学 一种多指标投影决策法的消费者网购风险评估方法
CN108694673A (zh) * 2018-05-16 2018-10-23 阿里巴巴集团控股有限公司 一种保险业务风险预测的处理方法、装置及处理设备
CN108876600B (zh) * 2018-08-20 2023-09-05 平安科技(深圳)有限公司 预警信息推送方法、装置、计算机设备和介质

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505713A (zh) * 2021-07-16 2021-10-15 上海塞嘉电子科技有限公司 一种基于机场安全管理平台的视频智能分析方法及系统
CN113505713B (zh) * 2021-07-16 2022-09-23 上海塞嘉电子科技有限公司 一种基于机场安全管理平台的视频智能分析方法及系统
CN115146725A (zh) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 对象分类模式的确定方法、对象分类方法、装置和设备
CN115148028A (zh) * 2022-06-30 2022-10-04 北京小马智行科技有限公司 依据历史数据构建车辆路测场景的方法、装置及一种车辆
CN115148028B (zh) * 2022-06-30 2023-12-15 北京小马智行科技有限公司 依据历史数据构建车辆路测场景的方法、装置及一种车辆
CN115274133A (zh) * 2022-07-15 2022-11-01 宝鸡市交通信息工程研究所 一种基于流调大数据的行踪识别方法

Also Published As

Publication number Publication date
CN110428091A (zh) 2019-11-08
CN110428091B (zh) 2022-12-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20836312

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 03/03/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20836312

Country of ref document: EP

Kind code of ref document: A1