WO2021135105A1 - Big data-based object recognition method, apparatus, device, and storage medium - Google Patents

Info

Publication number
WO2021135105A1
WO2021135105A1 PCT/CN2020/098978 CN2020098978W WO2021135105A1 WO 2021135105 A1 WO2021135105 A1 WO 2021135105A1 CN 2020098978 W CN2020098978 W CN 2020098978W WO 2021135105 A1 WO2021135105 A1 WO 2021135105A1
Authority
WO
WIPO (PCT)
Prior art keywords
points
lbs
location information
data
object recognition
Prior art date
Application number
PCT/CN2020/098978
Other languages
English (en)
French (fr)
Inventor
喻宁
陈克炎
朱艳乔
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021135105A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0204 Market segmentation
    • G06Q30/0205 Location or geographical consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to an object recognition method, device, equipment and storage medium based on big data.
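  • in the prior art, object recognition usually builds a model from a single type of data about the object and uses that model to recognize the object to be recognized.
  • for example, traditional business district recognition methods usually collect a single type of business district data (for example, the flow of people in the business district) as sample data for modeling.
  • because the sample data is of a single type and is not processed, this approach leads to low object recognition accuracy.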
  • this application provides a method, apparatus, device, and storage medium for object recognition based on big data, which aim to solve the low object recognition accuracy caused by the lack of sample data processing in the prior art.
  • this application provides a method for object recognition based on big data, which includes:
  • Obtaining step: obtain the location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within a preset time period;
  • Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area;
  • Training step: label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model;
  • Recognition step: receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.
  • the present application also provides an object recognition device based on big data, the device includes:
  • Obtaining module: used to obtain the location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within a preset time period;
  • Clustering module: used to perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area;
  • Training module: used to label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model;
  • Recognition module: used to receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.
  • the present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the following steps when executing the computer program:
  • Obtaining step: obtain the location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within a preset time period;
  • Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area;
  • Training step: label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model;
  • Recognition step: receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program is executed by a processor to implement the following steps:
  • Obtaining step: obtain the location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within a preset time period;
  • Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area;
  • Training step: label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model;
  • Recognition step: receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.
  • After performing data cleaning and clustering on the acquired location information, this application sets the multiple location information clusters obtained by clustering as corresponding target areas and obtains the attribute characteristics of all points of interest in each target area, which increases the diversity of the sample data.
  • The attribute characteristics of all points of interest in the target areas are used as a sample set to construct an object recognition model, and the attribute characteristics of the points of interest in the area to be recognized are input into the object recognition model to obtain the recognition result of that area.
  • This application can improve the generalization ability of the object recognition model by processing sample data, thereby improving the accuracy of object recognition.
  • FIG. 1 is an application environment diagram of a preferred embodiment of the computer device of this application.
  • FIG. 2 is a schematic diagram of the modules of the big data-based object recognition device.
  • FIG. 3 is a schematic flowchart of a preferred embodiment of the big data-based object recognition method of this application.
  • Referring to FIG. 1, a schematic diagram of a preferred embodiment of the computer device 1 of this application is shown.
  • the computer device 1 includes, but is not limited to: a memory 11, a processor 12, a display 13, and a network interface 14.
  • the computer device 1 is connected to the network through the network interface 14 to obtain original data.
  • the network may be an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, a call network, or another wireless or wired network.
  • the memory 11 includes at least one type of readable storage medium
  • the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc.
  • the memory 11 may be an internal storage unit of the computer device 1, for example, a hard disk or a memory of the computer device 1.
  • the memory 11 may also be an external storage device of the computer device 1, such as a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card.
  • the memory 11 may also include both the internal storage unit of the computer device 1 and its external storage device.
  • the memory 11 is generally used to store the operating system and various application software installed in the computer device 1, such as the program code of the object recognition program 10 based on big data.
  • the memory 11 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 12 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips.
  • the processor 12 is generally used to control the overall operation of the computer device 1, such as performing data interaction or communication-related control and processing.
  • the processor 12 is configured to run the program code or process data stored in the memory 11, for example, run the program code of the object recognition program 10 based on big data.
  • the display 13 may be referred to as a display screen or a display unit.
  • the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, and the like.
  • the display 13 is used for displaying the information processed in the computer device 1 and for displaying a visualized work interface, for example, displaying the results of data statistics.
  • the network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the network interface 14 is usually used to establish a communication connection between the computer device 1 and other electronic devices.
  • FIG. 1 only shows a computer device 1 with components 11-14 and an object recognition program 10 based on big data, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
  • the computer device 1 may also include a user interface.
  • the user interface may include a display (Display) and an input unit such as a keyboard (Keyboard).
  • the optional user interface may also include a standard wired interface and a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, and the like.
  • the display can also be called a display screen or a display unit as appropriate, and is used to display the information processed in the computer device 1 and to display a visualized user interface.
  • the computer device 1 may also include a radio frequency (RF) circuit, a sensor, an audio circuit, etc., which will not be repeated here.
  • Obtaining step: obtain the location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within a preset time period;
  • Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area;
  • Training step: label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model;
  • Recognition step: receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.
  • the storage device may be the memory 11 of the computer device 1 or other storage devices that are communicatively connected with the computer device 1.
  • the device 100 for object recognition based on big data described in this application can be installed in a computer device.
  • the module described in the present invention can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of a computer device, can complete fixed functions, and are stored in the memory of the computer device.
  • according to the functions implemented, the device 100 for object recognition based on big data includes: an obtaining module 110, a clustering module 120, a training module 130, and a recognition module 140.
  • the obtaining module 110 is configured to obtain the location information of the terminal devices of the preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within the preset time period.
  • big data technology can be used to collect the location information of a large number of user groups' terminal devices (for example, mobile phones).
  • the location information can be location-based service (LBS) information.
  • LBS uses various types of positioning technologies to obtain the current location of the terminal device and provides information resources and basic services to the located terminal device through the mobile Internet.
  • the acquired location information data may contain duplicate information and missing information. Therefore, data cleaning can be performed on the acquired location information data.
  • data mining technology is used to remove missing data, abnormal data, and incorrect data introduced during acquisition. From the cleaned location information, the location information that falls within the preset time period is then selected. In this embodiment, the location information data within the time period 10:00-22:00 can be selected.
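The cleaning and time-window selection described above could be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the record fields (`device_id`, `lat`, `lon`, `timestamp`), the coordinate-range check used as the "abnormal data" test, and the function name `clean_and_filter` are all assumptions introduced here.

```python
from datetime import datetime, time

def clean_and_filter(records, start=time(10, 0), end=time(22, 0)):
    """Drop incomplete, abnormal, and duplicate records, then keep those
    whose timestamp falls within [start, end] (10:00-22:00 by default)."""
    seen = set()
    kept = []
    for rec in records:
        device = rec.get("device_id")
        lat, lon = rec.get("lat"), rec.get("lon")
        ts = rec.get("timestamp")
        # Missing data: a required field is absent.
        if device is None or lat is None or lon is None or ts is None:
            continue
        # Abnormal data (assumed rule): coordinates outside the valid range.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue
        # Duplicate data: the same device, position, and time seen before.
        key = (device, lat, lon, ts)
        if key in seen:
            continue
        seen.add(key)
        # Time-window selection on the cleaned record.
        if start <= ts.time() <= end:
            kept.append(rec)
    return kept
```

Cleaning runs before the time-window check, matching the order given in the text; a real pipeline would likely apply further, source-specific validity rules.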
  • performing a data cleaning operation on the data of the location information of the terminal device includes:
  • the clustering module 120 is configured to perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area.
  • the clustering operation is performed on the location information within the preset time period based on the DBSCAN algorithm.
  • the DBSCAN algorithm is a density-based clustering algorithm. It generally assumes that categories can be determined by how tightly the samples are distributed: samples of the same category are closely connected, that is, not far from any sample of a category there must be other samples of the same category. Grouping closely connected samples into one category yields one cluster; dividing all closely connected samples into their respective categories yields the final clustering result.
  • aggregating the obtained core LBS points, density-reachable LBS points, and edge LBS points into a location information cluster includes: obtaining the LBS points that are density-reachable from the core LBS point, and iteratively using the density-reachable LBS points obtained to update the cluster corresponding to the core LBS point until the location information cluster of the core LBS point is obtained. It should be noted that, for sample points p and q in a sample set D, if q is in the neighborhood of p and p is a core sample point, then q is directly density-reachable from p.
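To make the core-point and density-reachability idea concrete, here is a compact pure-Python sketch of DBSCAN. It is illustrative only: the parameter names `eps` and `min_pts` are the conventional ones and are not taken from the application, and plain Euclidean distance on already-projected planar coordinates is assumed (raw latitude/longitude would need a geographic distance instead).

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0..k-1 for clusters, -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as an edge point
            continue
        # points[i] is a core point: start a new cluster and expand it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # edge point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # j is also a core point, so its neighbors are density-reachable.
                queue.extend(j_neighbors)
    return labels
```

Each resulting label group corresponds to one location information cluster; points labeled -1 belong to no cluster.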
  • a point of interest (POI) in a geographic information system can be a house, a shop, a mailbox, a bus stop, etc.
  • the training module 130 is used to label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain the object recognition model.
  • each target area is labeled using a preset labeling rule: a target area that is a business district is labeled 1, and a target area that is not a business district is labeled 0.
  • the labeled target area is used as a dependent variable, and the attribute characteristics of all points of interest in each target area are used as independent variables to generate a sample set, and the sample set is input into a random forest model for training to obtain an object recognition model.
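As a small illustration of the dependent/independent variable split, the sample set could be assembled as a feature matrix and a label vector. The structure of each area record (a `features` dictionary plus an `is_business_district` flag standing in for the unspecified preset labeling rule) is a hypothetical choice made for this sketch.

```python
def build_sample_set(areas, feature_names):
    """Return (X, y): one feature row and one 0/1 label per target area."""
    X, y = [], []
    for area in areas:
        # Independent variables: POI attribute characteristics, in a fixed order.
        X.append([area["features"].get(name, 0) for name in feature_names])
        # Dependent variable: 1 for a business district, 0 otherwise.
        y.append(1 if area["is_business_district"] else 0)
    return X, y
```

Fixing the column order through `feature_names` keeps training rows and later recognition requests aligned.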
  • sampling with replacement is performed on the samples in each target area of the sample set, and several sub-data sets are constructed, and the attribute features are sampled with replacement in the several sub-data sets, that is, part of the attribute features and part of the observations are selected.
  • the establishment of sub-decision trees includes: the attribute feature selected for the split criterion each time is the feature that minimizes the information entropy of the decision tree at this node.
  • pruning can be used to prevent overfitting.
  • branches are cut off on the criterion that the error does not increase; smaller branches are cut first, and pruning stops when the preset minimum number of nodes is reached. The prediction results of all decision trees are then combined by voting, and the result that receives the most votes is selected as the final recognition result.
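The sampling and voting parts of the random forest procedure can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the helper names are invented here, tree construction and pruning are omitted, and the feature subset is drawn without repeats, as is common in random forest implementations, even though the text speaks of sampling features with replacement.

```python
import random
from collections import Counter

def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) observations with replacement (one sub-data set)."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]

def random_feature_subset(feature_names, k, rng):
    """Pick k distinct attribute features for one sub-decision tree."""
    return rng.sample(feature_names, k)

def majority_vote(predictions):
    """Final result: the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]
```

Each sub-decision tree would be trained on one bootstrap sample restricted to its feature subset, and `majority_vote` would then combine the per-tree predictions.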
  • the recognition module 140 is configured to receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.
  • the solution is described by taking the object as a business district as an example.
  • a business district recognition request sent by a user is received and parsed to obtain the attribute characteristics of the points of interest in the area to be recognized (the types and number of points of interest in the area, for example commercial, industrial, catering, public utilities, and government agencies, as well as the average consumption at the points of interest, the flow of people at the points of interest in different time periods, etc.). These attribute characteristics are input into the object recognition model to obtain the probability that the area belongs to each classification result, and the recognition result is fed back to the user.
  • this application also provides an object recognition method based on big data.
  • Referring to FIG. 3, a schematic flowchart of an embodiment of the big data-based object recognition method of this application is shown.
  • the processor 12 of the computer device 1 executes the big data-based object recognition program 10 stored in the memory 11 to implement the following steps of the big data-based object recognition method:
  • Step S10: obtain the location information of the terminal devices of the preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within the preset time period.
  • big data technology can be used to collect the location information of a large number of user groups' terminal devices (for example, mobile phones).
  • the location information can be location-based service (LBS) information.
  • LBS uses various types of positioning technologies to obtain the current location of the terminal device and provides information resources and basic services to the located terminal device through the mobile Internet.
  • the acquired location information data may contain duplicate information and missing information. Therefore, data cleaning can be performed on the acquired location information data.
  • data mining technology can be used to remove missing data, abnormal data, and incorrect data introduced during acquisition. From the cleaned location information, the location information that falls within the preset time period is then selected. In this embodiment, the location information data within the time period 10:00-22:00 can be selected.
  • performing a data cleaning operation on the data of the location information of the terminal device includes:
  • Step S20: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area.
  • the clustering operation is performed on the location information within the preset time period based on the DBSCAN algorithm.
  • the DBSCAN algorithm is a density-based clustering algorithm. It generally assumes that categories can be determined by how tightly the samples are distributed: samples of the same category are closely connected, that is, not far from any sample of a category there must be other samples of the same category. Grouping closely connected samples into one category yields one cluster; dividing all closely connected samples into their respective categories yields the final clustering result.
  • aggregating the obtained core LBS points, density-reachable LBS points, and edge LBS points into a location information cluster includes: obtaining the LBS points that are density-reachable from the core LBS point, and iteratively using the density-reachable LBS points obtained to update the cluster corresponding to the core LBS point until the location information cluster of the core LBS point is obtained. It should be noted that, for sample points p and q in a sample set D, if q is in the neighborhood of p and p is a core sample point, then q is directly density-reachable from p.
  • a point of interest (POI) in a geographic information system can be a house, a shop, a mailbox, a bus stop, etc.
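One possible way to turn a location information cluster into a target area and collect POI attribute characteristics is sketched below. The application does not say how the area boundary is derived, so the bounding box used here is purely an assumption, as are the POI fields (`lat`, `lon`, `category`, `avg_spend`) and the feature names.

```python
from collections import Counter

def bounding_box(cluster_points):
    """Assumed target area: the axis-aligned box around the cluster's points."""
    lats = [p[0] for p in cluster_points]
    lons = [p[1] for p in cluster_points]
    return min(lats), min(lons), max(lats), max(lons)

def poi_features(cluster_points, pois, categories):
    """Count POIs per category inside the area and average their spend."""
    lat0, lon0, lat1, lon1 = bounding_box(cluster_points)
    inside = [p for p in pois
              if lat0 <= p["lat"] <= lat1 and lon0 <= p["lon"] <= lon1]
    counts = Counter(p["category"] for p in inside)
    features = {f"n_{c}": counts.get(c, 0) for c in categories}
    features["n_total"] = len(inside)
    spend = [p["avg_spend"] for p in inside if p.get("avg_spend") is not None]
    features["avg_spend"] = sum(spend) / len(spend) if spend else 0.0
    return features
```

The resulting dictionary is one row of independent variables for the target area; other boundary definitions (for example a convex hull) would change only `bounding_box`.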
  • Step S30: label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model.
  • each target area is labeled using preset labeling rules: a target area that is a business district is labeled 1, and a target area that is not a business district is labeled 0.
  • the labeled target area is used as a dependent variable, and the attribute characteristics of all points of interest in each target area are used as independent variables to generate a sample set, and the sample set is input into a random forest model for training to obtain an object recognition model.
  • sampling with replacement is performed on the samples in each target area of the sample set, and several sub-data sets are constructed, and the attribute features are sampled with replacement in the several sub-data sets, that is, part of the attribute features and part of the observations are selected.
  • the establishment of sub-decision trees includes: the attribute feature selected for the split criterion each time is the feature that minimizes the information entropy of the decision tree at this node.
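The entropy-based split criterion can be illustrated as follows. This sketch assumes numeric attribute features and binary threshold splits, and it scores each candidate by the weighted entropy of the two child nodes; the function names and the midpoint thresholds are choices made for the example, not details from the application.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels, feature_names):
    """Return (weighted_entropy, feature, threshold) for the split that
    minimizes the entropy after splitting, or None if no split exists."""
    best = None
    n = len(rows)
    for name in feature_names:
        values = sorted({r[name] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[name] <= thr]
            right = [y for r, y in zip(rows, labels) if r[name] > thr]
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if best is None or score < best[0]:
                best = (score, name, thr)
    return best
```

A sub-decision tree would call `best_split` at each node on the observations that reach it and recurse on the two resulting partitions.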
  • pruning can be used to prevent overfitting.
  • branches are cut off on the criterion that the error does not increase; smaller branches are cut first, and pruning stops when the preset minimum number of nodes is reached. The prediction results of all decision trees are then combined by voting, and the result that receives the most votes is selected as the final recognition result.
  • Step S40: receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.
  • the solution is described by taking the object as a business district as an example.
  • a business district recognition request sent by a user is received and parsed to obtain the attribute characteristics of the points of interest in the area to be recognized (the types and number of points of interest in the area, for example commercial, industrial, catering, public utilities, and government agencies, as well as the average consumption at the points of interest, the flow of people at the points of interest in different time periods, etc.). These attribute characteristics are input into the object recognition model to obtain the probability that the area belongs to each classification result, and the recognition result is fed back to the user.
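A rough sketch of the recognition step is given below, assuming the trained model is available as a list of per-tree prediction functions and the request carries the POI attribute characteristics under a hypothetical `poi_features` key. The class probability is taken here as the fraction of trees voting for each class, which is one simple way to obtain such a value and is not stated in the application.

```python
from collections import Counter

def class_probabilities(trees, features, classes=(0, 1)):
    """Fraction of trees voting for each class (1 = business district)."""
    votes = Counter(tree(features) for tree in trees)
    return {c: votes.get(c, 0) / len(trees) for c in classes}

def handle_request(request, trees):
    """Parse the request, run the model, and build the response for the user."""
    features = request["poi_features"]  # hypothetical request field
    probs = class_probabilities(trees, features)
    label = max(probs, key=probs.get)
    return {"label": label, "probabilities": probs}
```

For example, with three trees of which two predict class 1, the response would report class 1 with a probability of two thirds.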
  • the embodiment of the present application also proposes a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may be any one, or any combination, of a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, etc.
  • the computer-readable storage medium includes an object recognition program 10 based on big data, and when the object recognition program 10 based on big data is executed by a processor, the following operations are implemented:
  • Obtaining step: obtain the location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information, and select, from the cleaned location information, the location information that falls within a preset time period;
  • Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute characteristics of all points of interest in each target area;
  • Training step: label each target area using preset labeling rules, generate a sample set based on the labeled target areas and the attribute characteristics of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model;
  • Recognition step: receive an object recognition request sent by a user, parse the request to obtain the attribute characteristics of the points of interest in the area to be recognized, input those attribute characteristics into the object recognition model to obtain the recognition result of the area to be recognized, and feed the recognition result back to the user.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

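A big-data-based object recognition method, apparatus, device, and storage medium, relating to artificial intelligence technology. The method obtains location information of the terminal devices of a preset user group, performs a data cleaning operation on it, and then filters out the location information that falls within a preset time period (S10); performs a clustering operation on the location information within the preset time period to obtain multiple location information clusters, sets the clusters as corresponding target areas, and obtains the attribute features of all points of interest in each target area (S20); labels each target area to generate a sample set and inputs the sample set into a random forest model for training to obtain an object recognition model (S30); and receives an object recognition request from a user and inputs the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for that area (S40). Processing the sample data in this way can improve the generalization ability of the model and thereby the accuracy of object recognition.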

Description

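Big-data-based object recognition method, apparatus, device, and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on January 2, 2020, with application number CN202010002168.1 and entitled "Big-data-based object recognition method, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a big-data-based object recognition method, apparatus, device, and storage medium.
Background
In the prior art, the inventors found that object recognition is usually modeled on a single type of data about the object, and the model is then used to recognize the object to be recognized. For example, traditional business district recognition methods usually collect a single type of data about a business district (for example, its foot traffic) as sample data for modeling, and use the resulting model to recognize candidate business districts. Because the sample data is of a single type and is not further processed, the accuracy of object recognition is low.
Summary
In view of the above, this application provides a big-data-based object recognition method, apparatus, device, and storage medium, which aims to solve the problem in the prior art that the accuracy of object recognition is low because the sample data is not processed.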
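To achieve the above objective, this application provides a big-data-based object recognition method, the method including:
Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.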
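To achieve the above objective, this application also provides a big-data-based object recognition apparatus, the apparatus including:
Acquisition module: configured to obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
Clustering module: configured to perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
Training module: configured to label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
Recognition module: configured to receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.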
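To achieve the above objective, this application also provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer program:
Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.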
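To achieve the above objective, this application also provides a computer-readable storage medium on which a computer program is stored, where the computer program implements the following steps when executed by a processor:
Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.
By performing data cleaning and clustering on the obtained location information, this application sets the multiple location information clusters obtained by clustering as corresponding target areas and obtains the attribute features of all points of interest in each target area, which increases the diversity of the sample data. The attribute features of all points of interest in the target areas are used as a sample set to build an object recognition model, and the attribute features of the points of interest in an area to be recognized are input into the model to obtain a recognition result for that area. By processing the sample data in this way, this application can improve the generalization ability of the object recognition model and thereby the accuracy of object recognition.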
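Brief Description of the Drawings
FIG. 1 is a diagram of the application environment of a preferred embodiment of the computer device of this application;
FIG. 2 is a schematic diagram of the modules of the big-data-based object recognition apparatus;
FIG. 3 is a schematic flowchart of a preferred embodiment of the big-data-based object recognition method of this application.
The realization of the objectives, functional characteristics, and advantages of this application will be further described with reference to the embodiments and the accompanying drawings.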
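Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and not to limit it. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
FIG. 1 is a schematic diagram of a preferred embodiment of the computer device 1 of this application.
The computer device 1 includes, but is not limited to, a memory 11, a processor 12, a display 13, and a network interface 14. The computer device 1 connects to a network through the network interface 14 to obtain raw data. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or a telephony network.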
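The memory 11 includes at least one type of readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, or an optical disc. In some embodiments, the memory 11 may be an internal storage unit of the computer device 1, such as its hard disk or internal memory. In other embodiments, the memory 11 may be an external storage device of the computer device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the computer device 1. The memory 11 may of course include both an internal storage unit of the computer device 1 and an external storage device. In this embodiment, the memory 11 is generally used to store the operating system and the various application software installed on the computer device 1, such as the program code of the big-data-based object recognition program 10. The memory 11 can also be used to temporarily store various data that has been output or is to be output.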
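In some embodiments, the processor 12 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 12 is generally used to control the overall operation of the computer device 1, for example to perform control and processing related to data interaction or communication. In this embodiment, the processor 12 is used to run the program code stored in the memory 11 or to process data, for example to run the program code of the big-data-based object recognition program 10.
The display 13 may be called a display screen or display unit. In some embodiments, the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display 13 is used to display information processed in the computer device 1 and to display a visual working interface, for example the results of data statistics.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The network interface 14 is generally used to establish a communication connection between the computer device 1 and other electronic devices.
FIG. 1 shows only the computer device 1 with components 11-14 and the big-data-based object recognition program 10, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
Optionally, the computer device 1 may further include a user interface, which may include a display and an input unit such as a keyboard; the optional user interface may also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display, which may also be called a display screen or display unit, is used to display information processed in the computer device 1 and to display a visual user interface.
The computer device 1 may also include a radio frequency (RF) circuit, sensors, an audio circuit, and so on, which are not described in detail here.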
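In the above embodiment, the processor 12 can implement the following steps when executing the big-data-based object recognition program 10 stored in the memory 11:
Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.
The storage device may be the memory 11 of the computer device 1, or another storage device communicatively connected to the computer device 1.
For a detailed description of the above steps, refer to the description below of the module diagram of the big-data-based object recognition apparatus 100 in FIG. 2 and of the flowchart of an embodiment of the big-data-based object recognition method in FIG. 3.
The big-data-based object recognition apparatus 100 described in this application can be installed in a computer device and is organized into modules according to the functions it implements. A module in this application, which may also be called a unit, is a series of computer program segments that can be executed by the processor of the computer device to perform a fixed function and that are stored in the memory of the computer device.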
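FIG. 2 is a module diagram of an embodiment of the big-data-based object recognition apparatus 100. In this embodiment, the big-data-based object recognition apparatus 100 includes an acquisition module 110, a clustering module 120, a training module 130, and a recognition module 140.
The acquisition module 110 is configured to obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period.
In this embodiment, big data technology can be used to collect the location information of the terminal devices (for example, mobile phones) of a large user group. The location information may be location-based service (LBS) information. LBS uses various positioning technologies to obtain the current location of a terminal device and provides information resources and basic services to that device over the mobile Internet. The obtained location data may contain duplicate and missing records, so data cleaning can be performed on it: data mining techniques are used to remove the missing, abnormal, and erroneous data introduced during collection. The location information that falls within the preset time period is then filtered out from the cleaned data; in this embodiment, location data within the time period 10:00-22:00 can be selected.
In one embodiment, performing the data cleaning operation on the location information data of the terminal devices includes:
Location information data that is complete is selected as the cleaning sample and placed at the root of a CART decision tree, and the cleaning sample is divided into a first group of data and a second group of data. The first group is used to build the decision tree, with the information at each internal node of the tree serving as the splitting criterion, and the second group is used to prune the tree. Data cleaning ends when each class in the decision tree has only one node.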
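The simpler part of this stage (removing duplicate, missing, and abnormal records, then keeping only the 10:00-22:00 window) can be sketched as follows. This is a minimal illustration, not the CART-based procedure of the embodiment above; the record layout and the field names `device_id`, `timestamp`, `lat`, and `lon` are assumptions made for the example.

```python
from datetime import datetime, time

def clean_and_filter(records, start=time(10, 0), end=time(22, 0)):
    """Drop incomplete, invalid, and duplicate fixes, then keep those inside the time window."""
    seen = set()
    kept = []
    for rec in records:
        # Field names are hypothetical; the source does not define a record schema.
        device, ts, lat, lon = (rec.get(k) for k in ("device_id", "timestamp", "lat", "lon"))
        # Missing data: any absent field disqualifies the record.
        if device is None or ts is None or lat is None or lon is None:
            continue
        # Abnormal or erroneous data: coordinates outside the valid range.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue
        # Duplicate data: same device, time, and position.
        key = (device, ts, lat, lon)
        if key in seen:
            continue
        seen.add(key)
        # Preset time period filter.
        if start <= datetime.fromisoformat(ts).time() <= end:
            kept.append(rec)
    return kept

records = [
    {"device_id": "a", "timestamp": "2020-01-02T10:30:00", "lat": 22.54, "lon": 114.05},
    {"device_id": "a", "timestamp": "2020-01-02T10:30:00", "lat": 22.54, "lon": 114.05},  # duplicate
    {"device_id": "b", "timestamp": "2020-01-02T09:15:00", "lat": 22.55, "lon": 114.06},  # too early
    {"device_id": "c", "timestamp": "2020-01-02T12:00:00", "lat": None, "lon": 114.06},   # missing
    {"device_id": "d", "timestamp": "2020-01-02T21:59:00", "lat": 22.53, "lon": 114.04},
    {"device_id": "e", "timestamp": "2020-01-02T13:00:00", "lat": 122.5, "lon": 114.04},  # invalid lat
]
cleaned = clean_and_filter(records)
```

Only the records for devices `a` (once) and `d` survive in this sample.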
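The clustering module 120 is configured to perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area.
In this embodiment, the clustering operation on the location information within the preset time period is based on the DBSCAN algorithm, a density-based clustering algorithm that generally assumes that categories can be determined by how tightly the samples are distributed. Samples of the same category are closely connected; that is, near any sample of a category there must be other samples of the same category. Grouping closely connected samples together yields one cluster, and grouping every set of closely connected samples into its own category yields the final clustering result.
First, a density radius between LBS points and the minimum number of LBS points within that radius (MinPts) are set. Based on the density radius and MinPts, the core LBS points, density-reachable LBS points, and border LBS points are computed iteratively from all LBS points and aggregated into location information clusters. Aggregating the core LBS points, density-reachable LBS points, and border LBS points into location information clusters includes: obtaining the LBS points that are density-reachable from a core LBS point and using the density-reachable LBS points obtained by the iterative computation to update the cluster corresponding to that core LBS point, until the location information cluster of the core LBS point is obtained. Note that, for sample points p and q in a sample set D, if q is in the neighborhood of p and p is a core sample point, then q is directly density-reachable from p. For the sample set D, given sample points $p_1, p_2, \dots, p_n$ with $p = p_1$ and $q = p_n$, if each $p_i$ is directly density-reachable from $p_{i-1}$, then q is density-reachable from p.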
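Specifically: A. Initialize the set of core LBS points $\Omega = \emptyset$, the number of clusters $k = 0$, the set of unvisited samples $\Gamma = D$, and the cluster partition $C = \emptyset$;
B. For the location information set $D = (x_1, x_2, \dots, x_m)$ within the preset time period, for each $j = 1, 2, \dots, m$, use a distance measure (for example, Euclidean distance) to find the neighborhood subsample set $N_\epsilon(x_j)$ of sample $x_j$; if the number of samples in it satisfies $|N_\epsilon(x_j)| \ge \mathrm{MinPts}$, add $x_j$ to the set of core LBS points: $\Omega = \Omega \cup \{x_j\}$;
C. If the set of core LBS points $\Omega = \emptyset$, the algorithm ends; otherwise go to step D;
D. Randomly select a core LBS point $o$ from the set $\Omega$, initialize the core-point queue of the current cluster $\Omega_{cur} = \{o\}$, set the cluster index $k = k + 1$, initialize the sample set of the current cluster $C_k = \{o\}$, and update the set of unvisited samples $\Gamma = \Gamma - \{o\}$;
E. If the core-point queue of the current cluster $\Omega_{cur} = \emptyset$, the current cluster $C_k$ is complete: update the cluster partition $C = \{C_1, C_2, \dots, C_k\}$, update the set of core LBS points $\Omega = \Omega - C_k$, and go to step C; otherwise update the set of core LBS points $\Omega = \Omega - C_k$ and continue with step F;
F. Take a core LBS point $o'$ out of the queue $\Omega_{cur}$, use the neighborhood distance threshold $\epsilon$ to find its neighborhood subsample set $N_\epsilon(o')$, let $\Delta = N_\epsilon(o') \cap \Gamma$, update the sample set of the current cluster $C_k = C_k \cup \Delta$, update the set of unvisited samples $\Gamma = \Gamma - \Delta$, update $\Omega_{cur} = \Omega_{cur} \cup (\Delta \cap \Omega) - \{o'\}$, and go to step E. When the algorithm ends at step C, the output is the set of location information clusters $C = \{C_1, C_2, \dots, C_k\}$.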
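Steps A to F can be sketched in plain Python as follows. This is an illustrative, quadratic-time version that treats LBS points as 2-D coordinates under Euclidean distance; the function name `dbscan` and the `seed` parameter (used to make the random choice in step D repeatable) are assumptions for the example.

```python
import math
import random

def dbscan(points, eps, min_pts, seed=0):
    """Cluster 2-D points following steps A-F; returns a list of sets of point indices."""
    rng = random.Random(seed)
    n = len(points)

    def neighborhood(i):
        # eps-neighborhood of point i under Euclidean distance (includes i itself).
        return {j for j in range(n) if math.dist(points[i], points[j]) <= eps}

    # Steps A-B: collect the core points; every point starts out unvisited.
    neighbors = [neighborhood(i) for i in range(n)]
    core = {i for i in range(n) if len(neighbors[i]) >= min_pts}
    unvisited = set(range(n))
    clusters = []

    # Step C: stop once no unassigned core point remains.
    while core:
        # Step D: start a new cluster from a randomly chosen core point.
        o = rng.choice(sorted(core))
        queue = [o]
        cluster = {o}
        unvisited.discard(o)
        # Steps E-F: grow the cluster through density-reachable points.
        while queue:
            q = queue.pop()
            delta = neighbors[q] & unvisited
            cluster |= delta
            unvisited -= delta
            queue.extend(delta & core)  # only core points are expanded further
        clusters.append(cluster)
        core -= cluster
    return clusters

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
clusters = dbscan(points, eps=1.5, min_pts=3)
```

Points that are never pulled into a cluster, such as `(50, 50)` here, are left out of the result and correspond to noise.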
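The multiple location information clusters are set as corresponding target areas, with the boundary of each location information cluster used as the boundary of its target area. The attribute features of all points of interest in each target area are then obtained according to that boundary. The attribute features of a target area include: the types and number of all points of interest in the area (for example, commercial, industrial, catering, public utilities, and government agencies), the average spending at the points of interest, and the foot traffic at the points of interest in different time periods. In a geographic information system, a point of interest (POI) can be a house, a shop, a mailbox, a bus stop, and so on.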
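One way to turn a cluster into a feature row is sketched below. The source does not say how the cluster boundary is computed, so this example assumes an axis-aligned bounding box; the POI record fields (`x`, `y`, `type`, `avg_spend`) and the feature names are likewise hypothetical.

```python
from collections import Counter

def bounding_box(cluster_points):
    """Axis-aligned box around a cluster, used here as a stand-in for the area boundary."""
    xs = [x for x, _ in cluster_points]
    ys = [y for _, y in cluster_points]
    return min(xs), min(ys), max(xs), max(ys)

def area_features(cluster_points, pois, poi_types):
    """Count POIs per type and average their spending inside the cluster's bounding box."""
    min_x, min_y, max_x, max_y = bounding_box(cluster_points)
    inside = [p for p in pois if min_x <= p["x"] <= max_x and min_y <= p["y"] <= max_y]
    counts = Counter(p["type"] for p in inside)
    features = {f"count_{t}": counts.get(t, 0) for t in poi_types}
    features["poi_total"] = len(inside)
    features["avg_spend"] = sum(p["avg_spend"] for p in inside) / len(inside) if inside else 0.0
    return features

pois = [
    {"x": 1.0, "y": 1.0, "type": "commercial", "avg_spend": 100.0},
    {"x": 0.5, "y": 1.5, "type": "catering", "avg_spend": 50.0},
    {"x": 5.0, "y": 5.0, "type": "government", "avg_spend": 0.0},  # outside the box
]
features = area_features([(0, 0), (2, 2)], pois, ["commercial", "catering", "government"])
```

A convex hull or a buffered polygon would follow the cluster shape more closely; the bounding box is used only to keep the sketch short.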
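The training module 130 is configured to label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model.
In this embodiment, after the attribute features of all points of interest in each target area are obtained, each target area is labeled using the preset labeling rule: target areas that are business districts are labeled 1, and target areas that are not business districts are labeled 0. The sample set is generated with the labels of the target areas as the dependent variable and the attribute features of all points of interest in each target area as the independent variables, and is input into the random forest model for training to obtain the object recognition model.
Further, the sample set is divided into a training set and a validation set according to a preset ratio;
The random forest model is trained on the sample data of the training set to determine the specific parameters of the model, and the sample data of the validation set is used to verify the accuracy of the model. When the accuracy reaches a preset threshold, training ends and the object recognition model is obtained; when the accuracy does not reach the preset threshold, more sample data is added and training of the random forest model continues.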
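The split-train-validate loop described above can be sketched as follows. The 0.8 ratio, the 0.9 threshold, and the batch-by-batch way of adding sample data are assumptions, since the source only calls them preset values; `fit_mean_threshold` is a hypothetical stand-in learner used so the example runs without a random forest.

```python
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle and split samples into (training set, validation set) at a preset ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, validation):
    """Share of validation samples whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in validation) / len(validation)

def train_until_accurate(batches, fit, threshold=0.9):
    """Add one batch of samples per round until validation accuracy reaches the threshold."""
    samples, model, score = [], None, 0.0
    for batch in batches:
        samples.extend(batch)
        train, validation = split_samples(samples)
        model = fit(train)
        score = accuracy(model, validation)
        if score >= threshold:
            break
    return model, score

def fit_mean_threshold(train):
    # Hypothetical stand-in learner: cut the first feature midway between the class means.
    pos = [x[0] for x, y in train if y == 1]
    neg = [x[0] for x, y in train if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x[0] > cut else 0

# Each sample is (feature tuple, label); label 1 marks a business district.
batch = [((100 + i,), 1) for i in range(10)] + [((i,), 0) for i in range(10)]
model, score = train_until_accurate([batch], fit_mean_threshold)
```

Any learner that takes a training set and returns a prediction function can be passed as `fit`.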
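Specifically, the samples of the target areas in the sample set are sampled with replacement to construct several sub-datasets, and within these sub-datasets the attribute features are also sampled with replacement; that is, a subset of the attribute features and a subset of the observations is selected to build each sub-decision tree. When building each sub-decision tree, the attribute feature chosen as the splitting criterion at each node is the one that minimizes the information entropy of the tree at that node. After a decision tree is built, pruning can be used to prevent overfitting. Branches are pruned so as to avoid increasing the error: the branch whose removal increases the error least is pruned first, and pruning stops when a preset minimum number of nodes is reached. The predictions of all decision trees are then combined by voting, and the result with the most votes is taken as the final recognition result.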
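The three ideas in this paragraph (bootstrap sampling of observations, random selection of features, and majority voting over entropy-based trees) can be illustrated with a deliberately small sketch. To stay short, each tree here is a single entropy-minimizing split (a decision stump) and pruning is omitted, so this is not a full random forest; all function names are assumptions for the example.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Information entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_split(rows, feature_ids):
    """Return (score, feature, threshold) with the lowest weighted child entropy, or None."""
    best = None
    for f in feature_ids:
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best

def fit_stump(rows, feature_ids):
    split = best_split(rows, feature_ids)
    if split is None:  # no usable split: predict the majority label
        label = majority([y for _, y in rows])
        return lambda x: label
    _, f, threshold = split
    left = majority([y for x, y in rows if x[f] <= threshold])
    right = majority([y for x, y in rows if x[f] > threshold])
    return lambda x: left if x[f] <= threshold else right

def fit_forest(rows, n_trees=15, n_features=1, seed=0):
    rng = random.Random(seed)
    all_features = list(range(len(rows[0][0])))
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # observations drawn with replacement
        features = sorted(set(rng.choices(all_features, k=n_features)))  # features drawn with replacement
        trees.append(fit_stump(sample, features))
    return lambda x: majority([tree(x) for tree in trees])  # majority vote

rows = [((10 + i, 20 + i), 1) for i in range(5)] + [((i, i), 0) for i in range(5)]
forest = fit_forest(rows)
```

An odd number of trees avoids tied votes in the two-class case.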
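The recognition module 140 is configured to receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.
In this embodiment, the solution is described using a business district as the object. A business district recognition request from a user is received and parsed to obtain the attribute features of the points of interest in the area to be recognized that the request carries (the types and number of points of interest in the area, for example, commercial, industrial, catering, public utilities, and government agencies, as well as the average spending at the points of interest, the foot traffic at the points of interest in different time periods, etc.). The attribute features of the points of interest in the area to be recognized are input into the object recognition model to obtain the recognition result for that area. The recognition result includes the probability that the area to be recognized belongs to each classification result, and it is fed back to the user.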
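The source does not state how the per-class probability values are derived. One common convention for a voting ensemble, assumed here, is to report the share of trees that voted for each class; the function name `vote_probabilities` is hypothetical.

```python
from collections import Counter

def vote_probabilities(tree_predictions, classes=(0, 1)):
    """Share of trees voting for each class (1 = business district, 0 = not)."""
    counts = Counter(tree_predictions)
    total = len(tree_predictions)
    return {c: counts.get(c, 0) / total for c in classes}

probs = vote_probabilities([1, 1, 0, 1])
```

With three of four trees voting for class 1, the area is reported as a business district with probability 0.75.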
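In addition, this application provides a big-data-based object recognition method. FIG. 3 is a schematic flowchart of an embodiment of the big-data-based object recognition method of this application. When the processor 12 of the computer device 1 executes the big-data-based object recognition program 10 stored in the memory 11, the following steps of the method are implemented:
Step S10: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period.
In this embodiment, big data technology can be used to collect the location information of the terminal devices (for example, mobile phones) of a large user group. The location information may be location-based service (LBS) information. LBS uses various positioning technologies to obtain the current location of a terminal device and provides information resources and basic services to that device over the mobile Internet. The obtained location data may contain duplicate and missing records, so data cleaning can be performed on it: data mining techniques are used to remove the missing, abnormal, and erroneous data introduced during collection. The location information that falls within the preset time period is then filtered out from the cleaned data; in this embodiment, location data within the time period 10:00-22:00 can be selected.
In one embodiment, performing the data cleaning operation on the location information data of the terminal devices includes:
Location information data that is complete is selected as the cleaning sample and placed at the root of a CART decision tree, and the cleaning sample is divided into a first group of data and a second group of data. The first group is used to build the decision tree, with the information at each internal node of the tree serving as the splitting criterion, and the second group is used to prune the tree. Data cleaning ends when each class in the decision tree has only one node.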
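Step S20: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area.
In this embodiment, the clustering operation on the location information within the preset time period is based on the DBSCAN algorithm, a density-based clustering algorithm that generally assumes that categories can be determined by how tightly the samples are distributed. Samples of the same category are closely connected; that is, near any sample of a category there must be other samples of the same category. Grouping closely connected samples together yields one cluster, and grouping every set of closely connected samples into its own category yields the final clustering result.
First, a density radius between LBS points and the minimum number of LBS points within that radius (MinPts) are set. Based on the density radius and MinPts, the core LBS points, density-reachable LBS points, and border LBS points are computed iteratively from all LBS points and aggregated into location information clusters. Aggregating the core LBS points, density-reachable LBS points, and border LBS points into location information clusters includes: obtaining the LBS points that are density-reachable from a core LBS point and using the density-reachable LBS points obtained by the iterative computation to update the cluster corresponding to that core LBS point, until the location information cluster of the core LBS point is obtained. Note that, for sample points p and q in a sample set D, if q is in the neighborhood of p and p is a core sample point, then q is directly density-reachable from p. For the sample set D, given sample points $p_1, p_2, \dots, p_n$ with $p = p_1$ and $q = p_n$, if each $p_i$ is directly density-reachable from $p_{i-1}$, then q is density-reachable from p.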
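The definitions in the paragraph above can be written compactly as follows. The notation $\epsilon$ for the density radius and $\operatorname{dist}$ for the chosen distance measure is an assumption that matches the usual presentation of DBSCAN.

```latex
\begin{align*}
N_\epsilon(p) &= \{\, x \in D : \operatorname{dist}(p, x) \le \epsilon \,\} \\
p \text{ is a core point} &\iff |N_\epsilon(p)| \ge \mathrm{MinPts} \\
q \text{ is directly density-reachable from } p &\iff q \in N_\epsilon(p) \ \text{and}\ |N_\epsilon(p)| \ge \mathrm{MinPts} \\
q \text{ is density-reachable from } p &\iff \exists\, p_1, \dots, p_n \in D:\ p_1 = p,\ p_n = q,\
  p_i \text{ is directly density-reachable from } p_{i-1}\ (i = 2, \dots, n)
\end{align*}
```

Direct density-reachability is not symmetric: a border point can be reached from a core point, but nothing is reached from a border point, because its neighborhood holds fewer than MinPts points.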
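Specifically: A. Initialize the set of core LBS points $\Omega = \emptyset$, the number of clusters $k = 0$, the set of unvisited samples $\Gamma = D$, and the cluster partition $C = \emptyset$;
B. For the location information set $D = (x_1, x_2, \dots, x_m)$ within the preset time period, for each $j = 1, 2, \dots, m$, use a distance measure (for example, Euclidean distance) to find the neighborhood subsample set $N_\epsilon(x_j)$ of sample $x_j$; if the number of samples in it satisfies $|N_\epsilon(x_j)| \ge \mathrm{MinPts}$, add $x_j$ to the set of core LBS points: $\Omega = \Omega \cup \{x_j\}$;
C. If the set of core LBS points $\Omega = \emptyset$, the algorithm ends; otherwise go to step D;
D. Randomly select a core LBS point $o$ from the set $\Omega$, initialize the core-point queue of the current cluster $\Omega_{cur} = \{o\}$, set the cluster index $k = k + 1$, initialize the sample set of the current cluster $C_k = \{o\}$, and update the set of unvisited samples $\Gamma = \Gamma - \{o\}$;
E. If the core-point queue of the current cluster $\Omega_{cur} = \emptyset$, the current cluster $C_k$ is complete: update the cluster partition $C = \{C_1, C_2, \dots, C_k\}$, update the set of core LBS points $\Omega = \Omega - C_k$, and go to step C; otherwise update the set of core LBS points $\Omega = \Omega - C_k$ and continue with step F;
F. Take a core LBS point $o'$ out of the queue $\Omega_{cur}$, use the neighborhood distance threshold $\epsilon$ to find its neighborhood subsample set $N_\epsilon(o')$, let $\Delta = N_\epsilon(o') \cap \Gamma$, update the sample set of the current cluster $C_k = C_k \cup \Delta$, update the set of unvisited samples $\Gamma = \Gamma - \Delta$, update $\Omega_{cur} = \Omega_{cur} \cup (\Delta \cap \Omega) - \{o'\}$, and go to step E. When the algorithm ends at step C, the output is the set of location information clusters $C = \{C_1, C_2, \dots, C_k\}$.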
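Step B names Euclidean distance only as an example of a distance measure. When LBS points are stored as latitude and longitude in degrees, a great-circle distance lets the density radius be expressed in meters. The haversine formula below is a substituted technique that the source does not mention, and the function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def eps_neighborhood(points, i, eps_m):
    """Indices of all points within eps_m meters of points[i] (including i itself)."""
    return {j for j, p in enumerate(points) if haversine_m(points[i], p) <= eps_m}

points = [(22.5400, 114.0500), (22.5405, 114.0500), (22.6400, 114.0500)]
near_first = eps_neighborhood(points, 0, eps_m=100.0)
```

The second point lies roughly 56 m north of the first and falls inside the 100 m radius; the third is about 11 km away and does not.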
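The multiple location information clusters are set as corresponding target areas, with the boundary of each location information cluster used as the boundary of its target area. The attribute features of all points of interest in each target area are then obtained according to that boundary. The attribute features of a target area include: the types and number of all points of interest in the area (for example, commercial, industrial, catering, public utilities, and government agencies), the average spending at the points of interest, and the foot traffic at the points of interest in different time periods. In a geographic information system, a point of interest (POI) can be a house, a shop, a mailbox, a bus stop, and so on.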
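Step S30: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model.
In this embodiment, after the attribute features of all points of interest in each target area are obtained, each target area is labeled using the preset labeling rule: samples whose target area is a business district are labeled 1, and target areas that are not business districts are labeled 0. The sample set is generated with the labels of the target areas as the dependent variable and the attribute features of all points of interest in each target area as the independent variables, and is input into the random forest model for training to obtain the object recognition model.
Further, the sample set is divided into a training set and a validation set according to a preset ratio;
The random forest model is trained on the sample data of the training set to determine the specific parameters of the model, and the sample data of the validation set is used to verify the accuracy of the model. When the accuracy reaches a preset threshold, training ends and the object recognition model is obtained; when the accuracy does not reach the preset threshold, more sample data is added and training of the random forest model continues.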
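Specifically, the samples of the target areas in the sample set are sampled with replacement to construct several sub-datasets, and within these sub-datasets the attribute features are also sampled with replacement; that is, a subset of the attribute features and a subset of the observations is selected to build each sub-decision tree. When building each sub-decision tree, the attribute feature chosen as the splitting criterion at each node is the one that minimizes the information entropy of the tree at that node. After a decision tree is built, pruning can be used to prevent overfitting. Branches are pruned so as to avoid increasing the error: the branch whose removal increases the error least is pruned first, and pruning stops when a preset minimum number of nodes is reached. The predictions of all decision trees are then combined by voting, and the result with the most votes is taken as the final recognition result.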
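The source does not write out the entropy criterion. Under the standard definition, assumed here, the entropy of a node's sample set $S$ and the weighted entropy after splitting on attribute feature $a$ at threshold $t$ are:

```latex
\begin{align*}
H(S) &= -\sum_{c} p_c \log_2 p_c, \qquad p_c = \frac{|\{(x, y) \in S : y = c\}|}{|S|} \\
H(S \mid a, t) &= \frac{|S_{\le}|}{|S|}\, H(S_{\le}) + \frac{|S_{>}|}{|S|}\, H(S_{>}), \qquad
  S_{\le} = \{(x, y) \in S : x_a \le t\},\quad S_{>} = S \setminus S_{\le}
\end{align*}
```

For example, a node holding 5 business districts and 5 other areas has $H(S) = 1$ bit. A split that yields children with label counts (4, 1) and (1, 4) gives each child an entropy of about 0.722 bits, so $H(S \mid a, t) \approx 0.722$; a split that separates the classes perfectly gives 0 and would be preferred.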
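Step S40: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.
In this embodiment, the solution is described using a business district as the object. A business district recognition request from a user is received and parsed to obtain the attribute features of the points of interest in the area to be recognized that the request carries (the types and number of points of interest in the area, for example, commercial, industrial, catering, public utilities, and government agencies, as well as the average spending at the points of interest, the foot traffic at the points of interest in different time periods, etc.). The attribute features of the points of interest in the area to be recognized are input into the object recognition model to obtain the recognition result for that area. The recognition result includes the probability that the area to be recognized belongs to each classification result, and it is fed back to the user.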
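The request and response handling in step S40 could look like the sketch below. The source does not define a wire format, so the JSON layout, the field names (`user_id`, `poi_features`), the feature names in `FEATURE_ORDER`, and the class names are all assumptions made for illustration.

```python
import json

# Hypothetical fixed feature order expected by the trained model.
FEATURE_ORDER = ["count_commercial", "count_catering", "avg_spend"]

def parse_request(body):
    """Extract the user id and an ordered feature vector from a JSON request body."""
    payload = json.loads(body)
    features = payload["poi_features"]
    missing = [name for name in FEATURE_ORDER if name not in features]
    if missing:
        raise ValueError(f"missing POI features: {missing}")
    return payload["user_id"], [float(features[name]) for name in FEATURE_ORDER]

def build_response(user_id, probabilities):
    """Package per-class probabilities and the most likely class for the user."""
    label = max(probabilities, key=probabilities.get)
    return json.dumps({"user_id": user_id, "label": label, "probabilities": probabilities})

body = json.dumps({
    "user_id": "u1",
    "poi_features": {"count_commercial": 12, "count_catering": 30, "avg_spend": 85.5},
})
user_id, vector = parse_request(body)
response = build_response(user_id, {"business_district": 0.8, "other": 0.2})
```

Fixing the feature order in one place keeps the vector passed to the model consistent with the order used during training.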
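In addition, an embodiment of this application provides a computer-readable storage medium, which may be non-volatile or volatile. The computer-readable storage medium may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and so on. The computer-readable storage medium includes a big-data-based object recognition program 10 which, when executed by a processor, implements the following operations:
Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.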
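The specific implementation of the computer-readable storage medium of this application is substantially the same as that of the big-data-based object recognition method described above and is not repeated here.
It should be noted that the serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments. The terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, apparatus, article, or method that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, apparatus, article, or method. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the existence of other identical elements in the process, apparatus, article, or method that includes that element.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, an electronic apparatus, a network device, or the like) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of this application.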

Claims (20)

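  1. A big-data-based object recognition method, applied to a computer device, wherein the method comprises:
    Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
    Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
    Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
    Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.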
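  2. The big-data-based object recognition method according to claim 1, wherein the location information within the preset time period consists of LBS points, and performing the clustering operation on the location information within the preset time period based on the preset algorithm comprises:
    setting a density radius between LBS points and a minimum number of LBS points within the density radius, iteratively computing, based on the density radius and the minimum number of LBS points, core LBS points, density-reachable LBS points, and border LBS points from all LBS points, and aggregating the obtained core LBS points, density-reachable LBS points, and border LBS points into location information clusters.
  3. The big-data-based object recognition method according to claim 2, wherein aggregating the obtained core LBS points, density-reachable LBS points, and border LBS points into location information clusters comprises:
    obtaining the LBS points that are density-reachable from a core LBS point, and using the density-reachable LBS points obtained by the iterative computation to update the cluster corresponding to the core LBS point, until the location information cluster of the core LBS point is obtained.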
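  4. The big-data-based object recognition method according to claim 1, wherein the training step comprises:
    dividing the sample set into a training set and a validation set according to a preset ratio;
    training the random forest model on the sample data of the training set to determine the specific parameters of the model;
    using the sample data of the validation set to verify the accuracy of the model, ending training and obtaining the object recognition model when the accuracy reaches a preset threshold, and adding more sample data and continuing to train the random forest model when the accuracy does not reach the preset threshold.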
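  5. The big-data-based object recognition method according to claim 1, wherein performing the data cleaning operation on the location information data comprises:
    selecting location information data of the terminal devices that is complete as the cleaning sample, placing it at the root of a CART decision tree, and dividing the cleaning sample into a first group of data and a second group of data;
    using the first group of data to build the decision tree, with the information at each internal node of the decision tree serving as the splitting criterion;
    using the second group of data to prune the decision tree, and ending data cleaning when each class in the decision tree has only one node.
  6. The big-data-based object recognition method according to claim 1, wherein the attribute features of the points of interest include the types of the points of interest and the number of points of interest.
  7. The big-data-based object recognition method according to claim 1, wherein the recognition result includes the probability that the area to be recognized belongs to each classification result.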
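  8. A big-data-based object recognition apparatus, wherein the apparatus comprises:
    Acquisition module: configured to obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
    Clustering module: configured to perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
    Training module: configured to label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
    Recognition module: configured to receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.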
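  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor implements the following steps when executing the computer program:
    Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
    Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
    Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
    Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.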
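  10. The computer device according to claim 9, wherein the location information within the preset time period consists of LBS points, and performing the clustering operation on the location information within the preset time period based on the preset algorithm comprises:
    setting a density radius between LBS points and a minimum number of LBS points within the density radius, iteratively computing, based on the density radius and the minimum number of LBS points, core LBS points, density-reachable LBS points, and border LBS points from all LBS points, and aggregating the obtained core LBS points, density-reachable LBS points, and border LBS points into location information clusters.
  11. The computer device according to claim 10, wherein aggregating the obtained core LBS points, density-reachable LBS points, and border LBS points into location information clusters comprises:
    obtaining the LBS points that are density-reachable from a core LBS point, and using the density-reachable LBS points obtained by the iterative computation to update the cluster corresponding to the core LBS point, until the location information cluster of the core LBS point is obtained.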
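  12. The computer device according to claim 9, wherein the training step comprises:
    dividing the sample set into a training set and a validation set according to a preset ratio;
    training the random forest model on the sample data of the training set to determine the specific parameters of the model;
    using the sample data of the validation set to verify the accuracy of the model, ending training and obtaining the object recognition model when the accuracy reaches a preset threshold, and adding more sample data and continuing to train the random forest model when the accuracy does not reach the preset threshold.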
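  13. The computer device according to claim 9, wherein performing the data cleaning operation on the location information data comprises:
    selecting location information data of the terminal devices that is complete as the cleaning sample, placing it at the root of a CART decision tree, and dividing the cleaning sample into a first group of data and a second group of data;
    using the first group of data to build the decision tree, with the information at each internal node of the decision tree serving as the splitting criterion;
    using the second group of data to prune the decision tree, and ending data cleaning when each class in the decision tree has only one node.
  14. The computer device according to claim 9, wherein the attribute features of the points of interest include the types of the points of interest and the number of points of interest.
  15. The computer device according to claim 9, wherein the recognition result includes the probability that the area to be recognized belongs to each classification result.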
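  16. A computer-readable storage medium on which a computer program is stored, wherein the computer program implements the following steps when executed by a processor:
    Acquisition step: obtain location information of the terminal devices of a preset user group, perform a data cleaning operation on the location information data, and filter out, from the cleaned location information, the location information that falls within a preset time period;
    Clustering step: perform a clustering operation on the location information within the preset time period based on a preset algorithm to obtain multiple location information clusters, set the multiple location information clusters as corresponding multiple target areas, and obtain the attribute features of all points of interest in each target area;
    Training step: label each target area using a preset labeling rule, generate a sample set based on the labeled target areas and the attribute features of all points of interest in each target area, and input the sample set into a random forest model for training to obtain an object recognition model; and
    Recognition step: receive an object recognition request from a user, parse the request to obtain the attribute features of the points of interest in the area to be recognized that the request carries, input the attribute features of the points of interest in the area to be recognized into the object recognition model to obtain a recognition result for the area to be recognized, and feed the recognition result back to the user.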
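  17. The computer-readable storage medium according to claim 16, wherein the location information within the preset time period consists of LBS points, and performing the clustering operation on the location information within the preset time period based on the preset algorithm comprises:
    setting a density radius between LBS points and a minimum number of LBS points within the density radius, iteratively computing, based on the density radius and the minimum number of LBS points, core LBS points, density-reachable LBS points, and border LBS points from all LBS points, and aggregating the obtained core LBS points, density-reachable LBS points, and border LBS points into location information clusters.
  18. The computer-readable storage medium according to claim 17, wherein aggregating the obtained core LBS points, density-reachable LBS points, and border LBS points into location information clusters comprises:
    obtaining the LBS points that are density-reachable from a core LBS point, and using the density-reachable LBS points obtained by the iterative computation to update the cluster corresponding to the core LBS point, until the location information cluster of the core LBS point is obtained.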
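  19. The computer-readable storage medium according to claim 16, wherein the training step comprises:
    dividing the sample set into a training set and a validation set according to a preset ratio;
    training the random forest model on the sample data of the training set to determine the specific parameters of the model;
    using the sample data of the validation set to verify the accuracy of the model, ending training and obtaining the object recognition model when the accuracy reaches a preset threshold, and adding more sample data and continuing to train the random forest model when the accuracy does not reach the preset threshold.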
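  20. The computer-readable storage medium according to claim 16, wherein performing the data cleaning operation on the location information data comprises:
    selecting location information data of the terminal devices that is complete as the cleaning sample, placing it at the root of a CART decision tree, and dividing the cleaning sample into a first group of data and a second group of data;
    using the first group of data to build the decision tree, with the information at each internal node of the decision tree serving as the splitting criterion;
    using the second group of data to prune the decision tree, and ending data cleaning when each class in the decision tree has only one node.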
PCT/CN2020/098978 2020-01-02 2020-06-29 Big-data-based object recognition method, apparatus, device, and storage medium WO2021135105A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010002168.1 2020-01-02
CN202010002168.1A CN111210269B (zh) 2020-01-02 2020-01-02 Big-data-based object recognition method, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021135105A1 true WO2021135105A1 (zh) 2021-07-08

Family

ID=70789576

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098978 WO2021135105A1 (zh) 2020-01-02 2020-06-29 Big-data-based object recognition method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN111210269B (zh)
WO (1) WO2021135105A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
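CN114397244A (zh) * 2022-01-14 2022-04-26 长春工业大学 Method for identifying defects in metal additive manufacturing parts and related equipment
CN115022965A (zh) * 2022-07-25 2022-09-06 中国联合网络通信集团有限公司 Cell positioning method and apparatus, electronic device, and storage medium
CN115134407A (zh) * 2022-06-27 2022-09-30 平安银行股份有限公司 Active area determination method and apparatus, computer device, and storage medium
CN116827899A (zh) * 2023-08-30 2023-09-29 湖南于一科技有限公司 Object adding method and apparatus based on an Internet tool app
CN117251650A (zh) * 2023-11-20 2023-12-19 之江实验室 Geographic hotspot center identification method and apparatus, computer device, and storage medium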

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
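CN111210269B (zh) * 2020-01-02 2020-09-18 平安科技(深圳)有限公司 Big-data-based object recognition method, electronic device, and storage medium
CN111612100B (zh) * 2020-06-04 2023-11-03 商汤集团有限公司 Object re-identification method and apparatus, storage medium, and computer device
CN111860575B (zh) * 2020-06-05 2023-06-16 百度在线网络技术(北京)有限公司 Item attribute information processing method and apparatus, electronic device, and storage medium
CN111510752B (zh) * 2020-06-18 2021-04-23 平安国际智慧城市科技股份有限公司 Data transmission method and apparatus, server, and storage medium
CN112052848B (zh) * 2020-08-24 2022-09-20 腾讯科技(深圳)有限公司 Method and apparatus for acquiring sample data in street block labeling
CN112016326A (zh) * 2020-09-25 2020-12-01 北京百度网讯科技有限公司 Map area word recognition method and apparatus, electronic device, and storage medium
CN112294197A (zh) * 2020-11-04 2021-02-02 深圳市普森斯科技有限公司 Cleaning control method for a sweeping robot, electronic device, and storage medium
CN112364135B (zh) * 2020-12-03 2023-11-07 中国平安财产保险股份有限公司 Object push method, apparatus, device, and storage medium based on multi-source data
CN112380316B (zh) * 2020-12-09 2022-03-22 浙江浙蕨科技有限公司 Travel situation data processing method and storage medium
CN113051490A (zh) * 2021-04-19 2021-06-29 北京百度网讯科技有限公司 Method and apparatus for training a new point-of-interest prediction model and for predicting new points of interest
CN115438138B (zh) * 2022-11-09 2023-04-07 北京市城市规划设计研究院 Employment center identification method and apparatus, electronic device, and storage medium
CN115938031A (zh) * 2022-12-02 2023-04-07 深圳市鼎山科技有限公司 Big-data-based data recognition management system and method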

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150058088A1 (en) * 2013-08-22 2015-02-26 Mastercard International Incorporated Method and system for using transaction data to assign a trade area to a merchant location
CN106649331A (zh) * 2015-10-29 2017-05-10 阿里巴巴集团控股有限公司 商圈识别方法及设备
CN108596648A (zh) * 2018-03-20 2018-09-28 阿里巴巴集团控股有限公司 一种商圈判定方法和装置
CN109189917A (zh) * 2018-06-27 2019-01-11 华南师范大学 一种融合景观和社会特征的城市功能区划分方法及系统
CN110619090A (zh) * 2019-08-05 2019-12-27 香港理工大学深圳研究院 一种区域吸引力评估方法及设备
CN111210269A (zh) * 2020-01-02 2020-05-29 平安科技(深圳)有限公司 基于大数据的对象识别方法、电子装置及存储介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10402882B2 (en) * 2014-08-05 2019-09-03 Mastercard International Incorporated Method and system for integration of merchant trade areas into search results
WO2017198749A1 (en) * 2016-05-19 2017-11-23 Visiana Aps Image processing apparatus and method
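CN107862347A (zh) * 2017-12-04 2018-03-30 国网山东省电力公司济南供电公司 Random-forest-based method for detecting electricity theft
CN109684563A (zh) * 2018-11-19 2019-04-26 银联智惠信息服务(上海)有限公司 Business district recognition method and apparatus, and computer storage medium
CN109685573A (zh) * 2018-12-25 2019-04-26 拉扎斯网络科技(上海)有限公司 Business district data processing method and apparatus, electronic device, and storage medium
CN110210973A (zh) * 2019-05-31 2019-09-06 三峡大学 Insider trading identification method based on random forest and naive Bayes models
CN110597943B (zh) * 2019-09-16 2022-04-01 腾讯科技(深圳)有限公司 Artificial-intelligence-based point-of-interest processing method and apparatus, and electronic device
CN110634028B (zh) * 2019-09-18 2022-08-19 创优数字科技(广东)有限公司 Commodity structure configuration method and system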

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
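CN114397244A (zh) * 2022-01-14 2022-04-26 长春工业大学 Method for identifying defects in metal additive manufacturing parts and related equipment
CN115134407A (zh) * 2022-06-27 2022-09-30 平安银行股份有限公司 Active area determination method and apparatus, computer device, and storage medium
CN115134407B (zh) * 2022-06-27 2024-04-26 平安银行股份有限公司 Active area determination method and apparatus, computer device, and storage medium
CN115022965A (zh) * 2022-07-25 2022-09-06 中国联合网络通信集团有限公司 Cell positioning method and apparatus, electronic device, and storage medium
CN115022965B (zh) * 2022-07-25 2024-04-09 中国联合网络通信集团有限公司 Cell positioning method and apparatus, electronic device, and storage medium
CN116827899A (zh) * 2023-08-30 2023-09-29 湖南于一科技有限公司 Object adding method and apparatus based on an Internet tool app
CN116827899B (zh) * 2023-08-30 2023-12-01 湖南于一科技有限公司 Object adding method and apparatus based on an Internet tool app
CN117251650A (zh) * 2023-11-20 2023-12-19 之江实验室 Geographic hotspot center identification method and apparatus, computer device, and storage medium
CN117251650B (zh) * 2023-11-20 2024-02-06 之江实验室 Geographic hotspot center identification method and apparatus, computer device, and storage medium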

Also Published As

Publication number Publication date
CN111210269A (zh) 2020-05-29
CN111210269B (zh) 2020-09-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20911229

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20911229

Country of ref document: EP

Kind code of ref document: A1