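WO2022134480A1 - Method for establishing a risk prediction model, regional risk prediction method, and corresponding apparatus - Google Patents

Method for establishing a risk prediction model, regional risk prediction method, and corresponding apparatus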

Info

Publication number
WO2022134480A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
sample
region
risk
distribution
Prior art date
Application number
PCT/CN2021/097958
Other languages
English (en)
French (fr)
Inventor
黄际洲
周景博
卓安
刘吉
熊昊一
窦德景
王海峰
Original Assignee
北京百度网讯科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Priority to JP2021576944A (patent JP2023510665A)
Priority to EP21820072.3A (patent EP4040353B1)
Priority to KR1020217042672A (patent KR20220093046A)
Priority to US17/620,820 (patent US20220398465A1)
Publication of WO2022134480A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G06Q50/265 Personal security, identity or safety
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/90 Services for handling of emergency or hazardous situations, e.g. earthquake and tsunami warning systems [ETWS]

Definitions

  • the present disclosure relates to the field of computer application technology, and in particular to big data technology in the field of artificial intelligence technology.
  • the present disclosure provides a method for establishing a risk prediction model, a regional risk prediction method and a corresponding device, so as to facilitate the realization of regional risk prediction.
  • a method for establishing a risk prediction model comprising:
  • the training data includes a sample area set and a labeling result of the risk level of each sample area in the sample area set and the risk level of the region to which each sample area belongs;
  • the encoding network uses the regional features of the sample areas to encode a feature representation of each sample area; the discriminant network identifies the risk level of the region to which a sample area belongs according to the feature representation of the sample area; the classification network identifies the risk level of the sample area according to the feature representation of the sample area; the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample areas belonging to regions with different risk levels, and minimizing the difference between the classification network's identification results for the sample areas and the labeling results.
  • a regional risk prediction method comprising:
  • the risk prediction model is pre-established by the method described above.
  • an apparatus for establishing a risk prediction model comprising:
  • a data acquisition unit configured to acquire training data, where the training data includes a sample area set and a labeling result of the risk level of each sample area in the sample area set and the risk level of the region to which each sample area belongs;
  • a model training unit used for training an initial model including an encoding network, a discriminating network and a classification network using the training data, and using the encoding network and the classification network in the initial model to obtain the risk prediction model after training;
  • the encoding network uses the regional features of the sample areas to encode a feature representation of each sample area; the discriminant network identifies the risk level of the region to which a sample area belongs according to the feature representation of the sample area; the classification network identifies the risk level of the sample area according to the feature representation of the sample area; the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample areas belonging to regions with different risk levels, and minimizing the difference between the classification network's identification results for the sample areas and the labeling results.
  • a regional risk prediction device comprising:
  • a feature extraction unit for extracting regional features of the target area;
  • a risk prediction unit configured to input the regional features into a risk prediction model, and determine the risk level of the target area according to the output result of the risk prediction model;
  • the risk prediction model is pre-established by the device as described above.
  • an electronic device comprising:
  • the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
  • a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to perform the method as described above.
  • a computer program product comprising a computer program which, when executed by a processor, implements the method as described above.
  • the present disclosure provides a method for establishing a risk prediction model; the risk prediction model established in this way can predict the risk of a target area, so that the spread of event hazards can be effectively prevented and targeted preventive measures can be taken.
  • FIG. 1 is a flowchart of a method for establishing a risk prediction model provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic structural diagram of the initial model used in training, provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic structural diagram of a risk prediction model provided by an embodiment of the present disclosure.
  • FIG. 4 illustrates a regional risk prediction method provided by an embodiment of the present disclosure
  • FIG. 5 is a structural diagram of an apparatus for establishing a risk prediction model provided by the present disclosure
  • FIG. 6 is a structural diagram of a regional risk prediction device provided by the present disclosure.
  • FIG. 7 is a block diagram of an electronic device used to implement embodiments of the present disclosure.
  • infectious disease models are mainly used; that is, prediction relies on the spatiotemporal distribution of infected users, the transmission speed of the infectious disease, and its mode of transmission.
  • this model requires sufficient knowledge of the epidemic and accurate grasp of the situation, as well as sufficient professional knowledge background.
  • the spread of the epidemic is often sudden, and the onset is delayed (for example, there is an incubation period, and patients within the incubation period will not develop typical symptoms), which may result in insufficient risk prediction accuracy.
  • infectious disease models often have predictive power for areas where an outbreak has already spread, but not for areas it has not yet reached.
  • FIG. 1 is a flowchart of a method for establishing a risk prediction model provided by an embodiment of the present disclosure. As shown in FIG. 1 , the method may include the following steps:
  • training data is acquired, where the training data includes a sample area set and a labeling result of the risk level of each sample area in the sample area set and the risk level of the region to which each sample area belongs.
  • areas of various risk levels within regions of various risk levels may be collected in advance as samples.
  • the scope of a region is greater than the scope of an area.
  • a region can be a province, a city, an administrative region, and so on.
  • An area can be a neighborhood, a street, a school, a building, a factory, and so on.
  • the risk level of the region can be divided into two types such as high risk and low risk, and can also be divided into high risk, medium risk, low risk, no risk and so on.
  • the risk level of the area can also be divided into two types, such as high risk and low risk, and can also be divided into high risk, medium risk, low risk, no risk and so on.
  • the specific division manner and division granularity are not limited in the present disclosure.
  • the risk level of each sample area and the risk level of the region to which each sample area belongs can be marked in advance in the training data for use in the subsequent model training process.
  • an initial model including an encoding network, a discriminating network and a classification network is trained using the training data, and a risk prediction model is obtained by using the encoding network and the classification network in the initial model after the training is completed.
  • the encoding network uses the regional features extracted from the sample areas to encode a feature representation of each sample area; the discriminant network identifies the risk level of the region to which a sample area belongs based on the feature representation of the sample area; the classification network identifies the risk level of the sample area based on the feature representation of the sample area; the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample areas belonging to regions with different risk levels, and minimizing the difference between the classification network's identification results for the sample areas and the labeling results.
  • the present disclosure provides a method for establishing a risk prediction model; based on the established risk prediction model, risk prediction for a target area can be realized, so that the spread of event hazards can be effectively prevented and targeted preventive measures can be taken.
  • in step 101, assume that cities are pre-divided into high-risk cities and low-risk cities. Some high-risk cells and low-risk cells are then selected from known high-risk cities, and some low-risk cells are selected from known low-risk cities (usually there are no high-risk cells in low-risk cities).
  • the specific division method is determined by the infection and spread of the epidemic in cities and communities.
  • the selected cells are formed into a sample area set, and the city's risk level and the cell risk level are respectively marked for these cells, thereby constituting training data.
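The training data described above can be sketched as a list of doubly labeled records. This is a minimal illustration, not the disclosure's data format: the class name `SampleArea`, its field names, the identifiers, and the two-level `HIGH`/`LOW` encoding are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

HIGH, LOW = 1, 0  # assumed two-level scheme; the text also allows finer levels


@dataclass(frozen=True)
class SampleArea:
    area_id: str      # hypothetical cell identifier
    region_id: str    # hypothetical city identifier
    area_risk: int    # labeled risk level of the cell
    region_risk: int  # labeled risk level of the city the cell belongs to


def build_training_data(rows: Iterable[Tuple[str, str, int, int]]) -> List[SampleArea]:
    """Turn (area_id, region_id, area_risk, region_risk) rows into a sample area set."""
    samples = []
    for area_id, region_id, area_risk, region_risk in rows:
        if area_risk not in (HIGH, LOW) or region_risk not in (HIGH, LOW):
            raise ValueError(f"unknown risk level for {area_id}")
        samples.append(SampleArea(area_id, region_id, area_risk, region_risk))
    return samples


# High-risk and low-risk cells from a high-risk city, low-risk cells from a low-risk city.
training_data = build_training_data([
    ("cell-a", "city-1", HIGH, HIGH),
    ("cell-b", "city-1", LOW, HIGH),
    ("cell-c", "city-2", LOW, LOW),
])
```

Each record carries both labels because the discriminant network is trained against the city label while the classification network is trained against the cell label.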
  • regional features can be extracted separately for each cell in the training data.
  • the regional features extracted in the present disclosure may include at least one of: POI features of preset types in the surroundings, demographic features, and user travel features. Unlike existing infectious disease models, these regional features are not tied to confirmed cases, so community risk prediction can also be performed in cities where the epidemic has not broken out and no prior experience exists. These features are described in detail below.
  • the living facilities around a community usually have a relationship with the probability of the community being affected by the epidemic.
  • a community that lacks basic living facilities may face high risks, because residents may have to travel farther to meet their daily needs, which creates the possibility of infection along the way.
  • communities that lack basic living facilities are often poorly managed, which also leads to a high risk of infection.
  • the POI features of a preset type around a cell may include but are not limited to the following two:
  • the first type: distance information between the cell and the nearest POI of each preset type.
  • more than one type of POI can be preset, such as hospitals, clinics, schools, preschool educational institutions, bus stations, subway stations, airports, railway stations, long-distance bus stations, shopping malls, supermarkets, markets, stores, public security bureaus, attractions, etc.
  • Features can be characterized by the distance of the cell to the nearest hospital, the distance to the nearest clinic, the distance to the nearest school, and so on.
  • the second type: the completeness of living facilities within a preset distance range of the cell. For example, the completeness of living facilities within 1 km can be used as one of the features, measured by the presence of hospitals, bus stops, supermarkets, shopping malls, markets, etc. within 1 km, with 1 representing the highest degree of completeness and 0 the lowest.
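The two POI features above can be sketched as follows. This is only an illustration under stated assumptions: great-circle (haversine) distance stands in for whatever distance measure is actually used, and the completeness score is assumed to be the fraction of preset POI types found within the radius, which gives a value between 0 and 1 as the text requires. The function names and the input layout are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def poi_features(
    cell: Tuple[float, float],
    pois: List[Tuple[str, float, float]],
    poi_types: Sequence[str],
    radius_km: float = 1.0,
) -> Tuple[Dict[str, float], float]:
    """Return (nearest distance per POI type, completeness score in [0, 1])."""
    nearest = {t: math.inf for t in poi_types}  # inf means no POI of that type was found
    for poi_type, lat, lon in pois:
        if poi_type in nearest:
            distance = haversine_km(cell[0], cell[1], lat, lon)
            nearest[poi_type] = min(nearest[poi_type], distance)
    # Assumed scoring rule: share of preset types with at least one POI inside the radius.
    present = sum(1 for t in poi_types if nearest[t] <= radius_km)
    return nearest, present / len(poi_types)
```

For instance, `poi_features((0.0, 0.0), [("hospital", 0.0, 0.005)], ["hospital", "school"])` reports a hospital roughly 0.56 km away, no school, and a completeness of 0.5.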
  • the distribution of commuting distances in the community can be used as one of the demographic characteristics.
  • the average commuting distance of the cell can be used to characterize it.
  • the commuting distance can refer to the distance to and from work, or the distance to and from school, and so on.
  • the user travel features involved in the present disclosure may include, but are not limited to, at least one of the following:
  • the first type: travel mode. Travel modes such as walking, cycling, public transportation, and private cars can be predefined.
  • the second type: origin-destination pattern distribution, which can include information such as the type of destination and the distance between the origin and the destination. Destination types such as hospitals, restaurants, hotels, and schools can be defined in advance, and distance buckets can be predefined, such as within 3 km, 3 km to 10 km, 10 km to 20 km, etc.; the distance between the origin and the destination is calculated and mapped to the corresponding distance bucket as a feature.
  • the third type: origin-travel mode-destination pattern distribution.
  • the origin refers to the local community
  • the travel mode and destination type can be defined in advance, and then the top N combinations of the travel mode and destination type of the local community are counted as features.
  • N is a preset positive integer, for example, 20 is selected.
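The distance bucketing and the top-N combination count described above can be sketched with the standard library. The bucket edges follow the example in the text; the open-ended last bucket, the handling of exact boundary values, the trip tuple layout, and the function names are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Upper edges of the distance buckets in km, following the text's example.
# A final open-ended bucket for longer trips is an assumption.
BUCKET_EDGES_KM = [3.0, 10.0, 20.0]

Trip = Tuple[str, str, float]  # (travel_mode, destination_type, distance_km)


def distance_bucket(distance_km: float) -> int:
    """Map a trip distance to the index of its distance bucket."""
    for index, edge in enumerate(BUCKET_EDGES_KM):
        if distance_km < edge:
            return index
    return len(BUCKET_EDGES_KM)


def od_pattern_distribution(trips: List[Trip]) -> Dict[Tuple[str, int], float]:
    """Share of trips per (destination type, distance bucket) pair."""
    counts = Counter((dest, distance_bucket(dist)) for _, dest, dist in trips)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: count / total for key, count in counts.items()}


def top_n_mode_destination(trips: List[Trip], n: int = 20) -> List[Tuple[Tuple[str, str], int]]:
    """The n most frequent (travel mode, destination type) combinations for a cell."""
    counts = Counter((mode, dest) for mode, dest, _ in trips)
    return counts.most_common(n)
```

All trips are assumed to start from the cell in question, so the origin does not appear in the keys.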
  • the above features are relatively easy to obtain in any case, and can reflect the socioeconomic conditions and spatial interaction characteristics of an area at a fine granularity such as a cell, thereby enabling fine-grained identification of high-risk areas and reducing social costs.
  • the initial model may include an encoder network (Encoder), a discriminator network (Discriminator), a classification network (Classifier), and may also include a decoder network (Decoder).
  • the regional features extracted from the sample cells are used as the input of the encoding network. Since sample cells belonging to cities with different risk levels are used in actual training, this embodiment takes sample cells of high-risk cities and sample cells of low-risk cities as an example. For sample cells in high-risk cities, three feature vectors respectively represent the POI features of preset types in the surroundings, the demographic features, and the user travel features; three corresponding feature vectors represent the same for sample cells in low-risk cities. Fusing the three high-risk vectors, for example by splicing (concatenation), yields the features n_E of sample cells in high-risk cities; fusing the three low-risk vectors in the same way yields the features n_L of sample cells in low-risk cities.
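Fusion by splicing can be read as plain vector concatenation. The sketch below assumes each feature group has already been turned into a list of numbers; the function name `fuse_features` and the ordering of the groups are illustrative choices, not taken from the disclosure.

```python
from typing import List, Sequence


def fuse_features(
    poi: Sequence[float],
    demographic: Sequence[float],
    travel: Sequence[float],
) -> List[float]:
    """Concatenate the three feature groups into one input vector for the encoder."""
    return [*poi, *demographic, *travel]


# A cell with 2 POI values, 1 demographic value and 2 travel values gives a 5-dimensional input.
fused = fuse_features([0.5, 1.0], [0.3], [0.2, 0.8])
```

The same group order has to be used for every cell, in training and in prediction, so that each position in the fused vector always means the same thing.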
  • n_E is used as the input of the encoding network, which encodes it into the feature representation of the sample cells in high-risk cities.
  • n_L is used as the input of the encoding network, which encodes it into the feature representation of the sample cells in low-risk cities.
  • the output of the encoding network can be regarded as a new probability distribution obtained by transforming the input feature vector.
  • the function of the discriminant network is to identify, from an input feature representation, the risk level of the city from which that feature representation is derived.
  • An important training goal is to make the feature representations produced by the encoding network such that the discriminant network cannot tell which city they come from, that is, to minimize the difference in the discriminant network's identification of sample areas belonging to regions with different risk levels. This enables the encoding network to learn features that are common across cities.
  • a loss function, called the second loss function L_2, can be constructed for this training goal.
  • in that loss function, D() represents the recognition result of the discriminant network.
  • the discriminant network still needs to preserve its own function, that is, identifying the risk level of the source city. Therefore, an adversarial learning method can be used to construct another loss function, called the first loss function L_1, which trains the discriminant network to minimize the difference between its recognition results for the sample cells and the labeling results.
  • under the influence of L_1, the discriminant network continuously learns how to distinguish the source city risk levels of the feature representations of high-risk-city and low-risk-city sample cells, which leads to an increase in L_2. Then, under the influence of L_2, the encoding network tries to learn common features to reduce L_2. The encoding network and the discriminant network thus constantly compete with each other during learning and finally reach an equilibrium. At that point, the discriminant network cannot distinguish sample cells in high-risk cities from those in low-risk cities, and the encoding network has learned the characteristics common to sample cells in high-risk cities and sample cells in low-risk cities.
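The exact forms of L_1 and L_2 are not given in the text above. One common adversarial formulation that matches the described behaviour is sketched below; it is an assumption, not the disclosure's own equations. Here z_E and z_L are symbols introduced for this sketch to denote the encoded representations of n_E and n_L, and D(·) is assumed to output the probability that a representation comes from a high-risk city.

```latex
% Discriminant network: cross-entropy against the true city labels
L_1 = -\,\mathbb{E}\big[\log D(z_E)\big] \;-\; \mathbb{E}\big[\log\big(1 - D(z_L)\big)\big]

% Encoding network: the same cross-entropy with the city labels swapped
L_2 = -\,\mathbb{E}\big[\log\big(1 - D(z_E)\big)\big] \;-\; \mathbb{E}\big[\log D(z_L)\big]
```

Under these assumed forms, a discriminant network that separates the two groups more sharply lowers L_1 and raises L_2, which is the tension described above. When D outputs 1/2 for every input, both losses equal 2 log 2 and the discriminant network can no longer tell the two groups apart.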
  • although the above learning method can learn the characteristics common to sample cells in high-risk cities and sample cells in low-risk cities, it cannot learn the features that guide identification of a cell's risk level. Therefore, in the initial model, that identification is performed by the classification network.
  • a loss function, namely the third loss function L_3, can be constructed for the classification network.
  • y_E represents the labeling result of a sample cell.
  • ŷ_E represents the classification network's recognition result for that sample cell.
  • this loss function is used to optimize the encoding network and the classification network, so that, on top of learning the features common across cities, the encoding network further learns features that can guide identification of a cell's risk level, and the classification network learns to identify the risk level of a cell.
  • the above classification network is described by taking the binary classification as an example, but a multi-class classification network can also be used in the actual model.
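For the binary case, a standard choice for a classification loss of this kind is cross-entropy. The form below is a sketch under that assumption rather than the disclosure's own equation; N is the number of labeled sample cells, y_i the labeled risk level (1 for high risk, 0 for low risk), and ŷ_i the classification network's predicted probability of high risk, with the index i introduced for this sketch.

```latex
L_3 = -\frac{1}{N}\sum_{i=1}^{N}\Big[\, y_i \log \hat{y}_i + (1 - y_i)\log\big(1 - \hat{y}_i\big) \Big]
```

With K risk levels instead of two, the usual multi-class counterpart is L_3 = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} y_{i,k}\log\hat{y}_{i,k}, where y_{i,k} is a one-hot label.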
  • the coding network-decoding network framework is added to perform feature reconstruction in the present disclosure.
  • the role of the decoding network is to reconstruct the regional features from the input feature representation of a sample cell. That is, reconstruction from the feature representation of n_E yields a vector whose dimensions are consistent with n_E, and reconstruction from the feature representation of n_L yields a vector whose dimensions are consistent with n_L.
  • the optimization goal of the decoding network is to recover the original vector representation, that is, to minimize the difference between the reconstructed regional features and the regional features extracted from the sample areas. Accordingly, the fourth loss function L_4 can be constructed.
  • L4 is used to optimize the encoding network and decoding network, so that the feature representation learned by the encoding network still has the ability to describe the characteristics of a cell.
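A reconstruction loss of this kind is commonly written as a mean squared error between the input features and their reconstruction. The form below is an assumed sketch, not the disclosure's equation; n_i stands for the fused regional features of sample cell i (from either n_E or n_L), and Enc and Dec are shorthand introduced here for the encoding and decoding networks.

```latex
\hat{n}_i = \mathrm{Dec}\big(\mathrm{Enc}(n_i)\big), \qquad
L_4 = \frac{1}{N}\sum_{i=1}^{N} \big\lVert n_i - \hat{n}_i \big\rVert_2^2
```

L_4 is zero only when every cell's features are reconstructed exactly, which is why minimizing it keeps the learned representation descriptive of the cell itself.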
  • the above four loss functions are used to optimize and update the model parameters.
  • L_1 is used to optimize and update the parameters of the discriminant network; L_2, L_3 and L_4 are used to optimize and update the parameters of the encoding network; L_3 and L_4 are used to optimize and update the parameters of the classification network and the decoding network, respectively.
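The assignment of losses to networks stated above can be written down as a small lookup. The mapping itself follows the text; combining the applicable losses by an unweighted sum is an assumption, since the text does not say how they are weighted. The network names and function names are illustrative.

```python
from typing import Dict, List, Tuple

# Which networks each loss updates, as stated in the text.
LOSS_TARGETS: Dict[str, Tuple[str, ...]] = {
    "L1": ("discriminator",),
    "L2": ("encoder",),
    "L3": ("encoder", "classifier"),
    "L4": ("encoder", "decoder"),
}


def losses_for(network: str) -> List[str]:
    """Names of the losses whose gradients update the given network."""
    return sorted(name for name, targets in LOSS_TARGETS.items() if network in targets)


def network_objectives(loss_values: Dict[str, float]) -> Dict[str, float]:
    """Per-network objective: the sum of the losses that update it (unweighted sum assumed)."""
    networks = sorted({n for targets in LOSS_TARGETS.values() for n in targets})
    return {n: sum(loss_values[name] for name in losses_for(n)) for n in networks}
```

In an adversarial setup of this kind, the discriminant network and the encoding network are typically updated in alternation, each with the other's parameters held fixed.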
  • after training, the risk prediction model is obtained from the trained encoding network and classification network, as shown in FIG. 3. That is to say, although the discriminant network and the decoding network assist training, only the encoding network and the classification network are used in the actual risk prediction model.
  • FIG. 4 illustrates a regional risk prediction method provided by an embodiment of the present disclosure; the method is implemented based on the risk prediction model established above. As shown in FIG. 4, the method includes:
  • the regional features extracted in this step are consistent with those used in training the risk prediction model, and may likewise include at least one of: POI features of preset types in the surroundings, demographic features, and user travel features. For the specific content of the regional features, refer to the relevant description in the embodiment shown in FIG. 1, which will not be repeated here.
  • the region features are input into the risk prediction model, and the risk level of the target region is determined according to the output result of the risk prediction model.
  • n_T is used as the input of the encoding network, which encodes it into the feature representation of the target cell. The classification network then identifies the risk level of the target cell based on that feature representation.
  • the information of the region to which the target area belongs is not required, and the risk level prediction is not related to the region.
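The prediction path above, encoder followed by classifier with no city information, can be sketched as a simple function composition. The `toy_encoder` and `toy_classifier` below are hypothetical stand-ins for the trained networks, and the 0.5 threshold for a two-level output is an assumption.

```python
import math
from typing import Callable, List, Sequence, Tuple

Encoder = Callable[[Sequence[float]], List[float]]
Classifier = Callable[[Sequence[float]], float]


def make_risk_model(
    encoder: Encoder, classifier: Classifier, threshold: float = 0.5
) -> Callable[[Sequence[float]], Tuple[str, float]]:
    """Compose a trained encoder and classifier into a risk prediction function."""
    def predict(features: Sequence[float]) -> Tuple[str, float]:
        representation = encoder(features)   # feature representation of the target cell
        p_high = classifier(representation)  # probability that the cell is high risk
        return ("high" if p_high >= threshold else "low", p_high)
    return predict


# Toy stand-ins for the trained networks, for illustration only.
def toy_encoder(features: Sequence[float]) -> List[float]:
    return [sum(features) / len(features)]


def toy_classifier(representation: Sequence[float]) -> float:
    return 1.0 / (1.0 + math.exp(-(representation[0] - 1.0)))


predict = make_risk_model(toy_encoder, toy_classifier)
```

Note that `predict` takes only the target cell's regional features; the discriminant network and the decoding network play no part at prediction time.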
  • the apparatus 500 may be an application located on the server side, or may be a functional unit such as a plug-in or software development kit (SDK) in an application located on the server side, or may be located at a computer terminal with relatively strong computing capability, which is not particularly limited in this embodiment of the present disclosure.
  • the apparatus 500 may include: a data acquisition unit 501 and a model training unit 502 , and may further include a feature extraction unit 503 .
  • the main functions of each unit are as follows:
  • the data acquisition unit 501 is configured to acquire training data, where the training data includes a sample area set and an annotation result of the risk level of each sample area in the sample area set and the risk level of the region to which each sample area belongs.
  • the model training unit 502 is used for training an initial model including an encoding network, a discriminating network and a classification network by using the training data, and after the training is completed, a risk prediction model is obtained by using the encoding network and the classification network in the initial model.
  • the encoding network uses the regional features of the sample areas to encode a feature representation of each sample area; the discriminant network identifies the risk level of the region to which a sample area belongs based on the feature representation of the sample area; the classification network identifies the risk level of the sample area based on the feature representation of the sample area; the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample areas belonging to regions with different risk levels, and minimizing the difference between the classification network's recognition results for the sample areas and the labeling results.
  • the feature extraction unit 503 is configured to obtain the regional features of the sample area, including at least one of the following: POI features of the surrounding preset types, demographic features, and user travel features.
  • the POI features of the surrounding preset types include at least one of: distance information between the sample area and the nearest preset type POI, and the completeness of living facilities within the preset distance range of the sample area.
  • Demographic characteristics include at least one of: population density distribution, commuting distance distribution, age distribution, gender distribution, income distribution, spending power distribution, education level distribution, marital status distribution, life stage distribution, employment type distribution, and industry type distribution.
  • the travel characteristics of the user include at least one of: travel mode, departure-destination pattern distribution, and departure-travel mode-destination pattern distribution.
  • the above-mentioned initial model may further include a decoding network.
  • the decoding network reconstructs the regional features according to the feature representation of the sample region; the training objective also includes: minimizing the difference between the regional features reconstructed by the decoding network and the regional features extracted from the sample region.
  • the model training unit 502 uses the first loss function to optimize the parameters of the discriminant network, uses the second loss function, the third loss function and the fourth loss function to optimize the parameters of the encoding network, uses the third loss function to optimize the parameters of the classification network, and uses the fourth loss function to optimize the parameters of the decoding network.
  • the first loss function is used to minimize the difference between the recognition result and the labeling result of the sample region by the discriminant network.
  • the second loss function is used to minimize the discriminant network's identification differences for sample regions belonging to regions with different risk levels.
  • the third loss function is used to minimize the difference between the classification network's recognition result for the sample area and the labeling result.
  • the fourth loss function is used to minimize the difference between the region features reconstructed by the decoding network and the region features extracted from the sample regions.
  • the device can be an application located on the server side, or can be a functional unit such as a plug-in or software development kit (SDK) in an application located on the server side, or it may be located at a computer terminal with relatively strong computing capability, which is not particularly limited in this embodiment of the present disclosure.
  • the apparatus 600 may include: a feature extraction unit 601 and a risk prediction unit 602 .
  • the main functions of each unit are as follows:
  • the feature extraction unit 601 is used for extracting regional features of the target residential community.
  • the risk prediction unit 602 is used for inputting the region features into the risk prediction model, and determining the risk level of the target region according to the output result of the risk prediction model.
  • the risk prediction model is pre-established by the device shown in FIG. 5 .
  • the regional risk level predicted by the above-mentioned regional risk prediction device is the risk level of epidemic spread.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure.
  • Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • the device 700 includes a computing unit 701 that can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 702 or a computer program loaded from a storage unit 708 into a random access memory (RAM) 703. The RAM 703 can also store various programs and data required for the operation of the device 700.
  • the computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
  • An input/output (I/O) interface 705 is also connected to bus 704 .
  • Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706, such as a keyboard, mouse, etc.; an output unit 707, such as various types of displays, speakers, etc.; a storage unit 708, such as a magnetic disk, an optical disk, etc. ; and a communication unit 709, such as a network card, a modem, a wireless communication transceiver, and the like.
  • the communication unit 709 allows the device 700 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • Computing unit 701 may be various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of computing unit 701 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various specialized artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any suitable processor, controller, microcontroller, etc.
  • the computing unit 701 performs the various methods and processes described above, such as a method of establishing a risk prediction model or a regional risk prediction method. For example, in some embodiments, a method of building a risk prediction model or a regional risk prediction method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 708 .
  • part or all of the computer program may be loaded and/or installed on device 700 via ROM 702 and/or communication unit 709.
  • When the computer program is loaded into RAM 703 and executed by computing unit 701, one or more steps of the above-described method of building a risk prediction model and regional risk prediction method can be performed.
  • the computing unit 701 may be configured by any other suitable means (eg, by means of firmware) to perform a method of building a risk prediction model or a method of regional risk prediction.
  • Various implementations of the systems and techniques described herein may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various embodiments may include being implemented in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor; the processor, which may be a special-purpose or general-purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • more specific examples of machine-readable storage media would include an electrical connection based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • to provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).
  • the systems and techniques described herein may be implemented on a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user's computer having a graphical user interface or web browser through which a user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (eg, a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
  • a computer system can include clients and servers.
  • Clients and servers are generally remote from each other and usually interact through a communication network.
  • the relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Educational Administration (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)

Abstract

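The present disclosure discloses a method for establishing a risk prediction model, a regional risk prediction method, and corresponding apparatuses, and relates to big data technology in the field of artificial intelligence. A specific implementation is as follows: training data is acquired, including annotation results for the risk level of each sample region and for the risk level of the area to which each sample region belongs; an initial model comprising an encoding network, a discriminant network, and a classification network is trained with the training data, and after training the risk prediction model is obtained from the encoding network and the classification network. The encoding network encodes the regional features of the sample regions to obtain a feature representation of each sample region; the discriminant network identifies, from the feature representation of a sample region, the risk level of the area to which the sample region belongs; the classification network identifies, from the feature representation of a sample region, the risk level of the sample region. The training objectives include: minimizing the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels, and minimizing the difference between the classification network's identification result for a sample region and the annotation result.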

Description

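Method for Establishing Risk Prediction Model, Regional Risk Prediction Method and Corresponding Apparatuses
This application claims priority to Chinese patent application No. 2020115159533, filed on December 21, 2020 and entitled "Method for Establishing Risk Prediction Model, Regional Risk Prediction Method and Corresponding Apparatuses".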
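Technical Field
The present disclosure relates to the field of computer application technology, and in particular to big data technology in the field of artificial intelligence.
Background
Public emergencies, such as the spread of epidemics, biological disasters, and meteorological disasters, greatly affect people's production, daily life, and even safety. If regional risk can be predicted promptly and accurately, the spread of harm from such events can be effectively prevented and targeted preventive measures can be taken, which is of great significance.
Summary
The present disclosure provides a method for establishing a risk prediction model, a regional risk prediction method, and corresponding apparatuses, so as to enable regional risk prediction.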
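According to a first aspect of the present disclosure, a method for establishing a risk prediction model is provided, comprising:
acquiring training data, the training data comprising a sample region set and annotation results for the risk level of each sample region in the sample region set and for the risk level of the area to which each sample region belongs;
training, with the training data, an initial model comprising an encoding network, a discriminant network, and a classification network, and after training, obtaining the risk prediction model from the encoding network and the classification network in the initial model;
wherein the encoding network encodes the regional features of the sample regions to obtain a feature representation of each sample region; the discriminant network identifies, according to the feature representation of a sample region, the risk level of the area to which the sample region belongs; the classification network identifies, according to the feature representation of a sample region, the risk level of the sample region; and the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels, and minimizing the difference between the classification network's identification result for a sample region and the annotation result.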
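According to a second aspect of the present disclosure, a regional risk prediction method is provided, comprising:
extracting regional features of a target residential community;
inputting the regional features into a risk prediction model, and determining the risk level of the target region according to the output of the risk prediction model;
wherein the risk prediction model is established in advance by the method described above.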
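According to a third aspect of the present disclosure, an apparatus for establishing a risk prediction model is provided, comprising:
a data acquisition unit, configured to acquire training data, the training data comprising a sample region set and annotation results for the risk level of each sample region in the sample region set and for the risk level of the area to which each sample region belongs;
a model training unit, configured to train, with the training data, an initial model comprising an encoding network, a discriminant network, and a classification network, and after training, to obtain the risk prediction model from the encoding network and the classification network in the initial model;
wherein the encoding network encodes the regional features of the sample regions to obtain a feature representation of each sample region; the discriminant network identifies, according to the feature representation of a sample region, the risk level of the area to which the sample region belongs; the classification network identifies, according to the feature representation of a sample region, the risk level of the sample region; and the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels, and minimizing the difference between the classification network's identification result for a sample region and the annotation result.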
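According to a fourth aspect of the present disclosure, a regional risk prediction apparatus is provided, comprising:
a feature extraction unit, configured to extract regional features of a target residential community;
a risk prediction unit, configured to input the regional features into a risk prediction model and to determine the risk level of the target region according to the output of the risk prediction model;
wherein the risk prediction model is established in advance by the apparatus described above.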
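According to a fifth aspect of the present disclosure, an electronic device is provided, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method described above.
According to a sixth aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to cause a computer to perform the method described above.
According to a seventh aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method described above.
As can be seen from the above technical solutions, the present disclosure provides a method for establishing a risk prediction model. Based on the established risk prediction model, risk prediction for a target region can be realized, so that the spread of harm from an event can be effectively prevented and targeted preventive measures can be taken.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.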
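Brief Description of the Drawings
The drawings are provided for a better understanding of the solution and do not constitute a limitation of the present disclosure. In the drawings:
FIG. 1 is a flowchart of a method for establishing a risk prediction model according to an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of the initial model used in training according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a risk prediction model according to an embodiment of the present disclosure;
FIG. 4 illustrates a regional risk prediction method according to an embodiment of the present disclosure;
FIG. 5 is a structural diagram of an apparatus for establishing a risk prediction model provided by the present disclosure;
FIG. 6 is a structural diagram of a regional risk prediction apparatus provided by the present disclosure;
FIG. 7 is a block diagram of an electronic device used to implement embodiments of the present disclosure.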
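Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding; these should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
Existing approaches to risk prediction for public emergencies such as epidemics mainly rely on models such as infectious disease models, which predict from the spatio-temporal distribution of infected users, the transmission speed of the disease, its transmission routes, and so on. Such models, however, require sufficient understanding of the epidemic, an accurate grasp of the situation, and sufficient professional background knowledge. Yet an epidemic often spreads suddenly and its onset is delayed (for example, there is an incubation period during which patients show no typical symptoms), which may make the risk prediction insufficiently accurate. In addition, such infectious disease models usually have predictive power for areas where the epidemic has already spread, but cannot make predictions for areas where no outbreak has yet occurred.
In the approach to establishing a risk prediction model provided by the present disclosure, the model learns the features of regions located in areas of different risk levels as well as the features of regions that themselves have different risk levels, so that the risk level of a region whose risk status is unknown can be predicted from these features. The approach provided by the present disclosure is described in detail below with reference to embodiments.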
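FIG. 1 is a flowchart of a method for establishing a risk prediction model according to an embodiment of the present disclosure. As shown in FIG. 1, the method may include the following steps:
In 101, training data is acquired; the training data includes a sample region set and annotation results for the risk level of each sample region in the set and for the risk level of the area to which each sample region belongs.
In the present disclosure, regions of various risk levels located in areas of various risk levels may be collected in advance as samples. The scope of an area is larger than that of a region. For example, an area may be a province, a city, an administrative district, and so on. A region may be a residential community, a street, a school, a building, a factory, and so on.
The risk levels of an area may be divided into two levels, such as high risk and low risk, or into more levels, such as high risk, medium risk, low risk, and no risk. The risk levels of a region may likewise be divided into two levels, such as high risk and low risk, or into more levels, such as high risk, medium risk, low risk, and no risk. The specific division and its granularity are not limited in the present disclosure.
In the training data, the risk level of the area to which each sample region belongs and the risk level of each sample region may be annotated in advance for use in subsequent model training.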
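In 102, an initial model including an encoding network, a discriminant network, and a classification network is trained with the training data; after training, the risk prediction model is obtained from the encoding network and the classification network in the initial model.
The encoding network encodes the regional features extracted from the sample regions to obtain a feature representation of each sample region; the discriminant network identifies, according to the feature representation of a sample region, the risk level of the area to which the sample region belongs; the classification network identifies, according to the feature representation of a sample region, the risk level of the sample region. The training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels, and minimizing the difference between the classification network's identification result for a sample region and the annotation result.
As can be seen from the above technical solution, the present disclosure provides a method for establishing a risk prediction model. Based on the established risk prediction model, risk prediction for a target region can be realized, so that the spread of harm from an event can be effectively prevented and targeted preventive measures can be taken.
The steps of the above embodiment are described in detail below. In addition, since the approach provided by the present disclosure applies well to epidemic risk prediction, the following embodiments use epidemic risk prediction as an example.
In step 101 above, suppose that cities are divided in advance into high-risk cities and low-risk cities. Some high-risk communities and low-risk communities are then selected from known high-risk cities, and some low-risk communities are selected from known low-risk cities (a low-risk city usually has no high-risk communities). The specific division is determined by the infection and transmission status of the epidemic in the cities and communities. The selected communities form the sample region set, and each community is annotated with the risk level of the city to which it belongs and with its own risk level, thereby forming the training data.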
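The two-level annotation described in step 101 can be pictured as a small record per sample community. The following is only a sketch under stated assumptions: the class name `SampleRegion`, the helper `build_training_set`, the community names, and the 0/1 label encoding are all hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampleRegion:
    """One annotated sample region (here, a residential community)."""
    name: str              # hypothetical identifier
    features: List[float]  # regional features extracted for the community
    area_risk: int         # label of the containing city: 1 = high risk, 0 = low risk
    region_risk: int       # label of the community itself: 1 = high risk, 0 = low risk


def build_training_set(high_risk_city, low_risk_city):
    """Each argument is a list of (name, features, region_risk) tuples."""
    samples = []
    for name, feats, risk in high_risk_city:
        samples.append(SampleRegion(name, feats, area_risk=1, region_risk=risk))
    for name, feats, risk in low_risk_city:
        samples.append(SampleRegion(name, feats, area_risk=0, region_risk=risk))
    return samples


training_set = build_training_set(
    high_risk_city=[("community_a", [0.2, 0.9], 1), ("community_b", [0.7, 0.3], 0)],
    low_risk_city=[("community_c", [0.8, 0.1], 0)],
)
```

Keeping both labels on the same record makes it straightforward to feed the area label to the discriminant network and the region label to the classification network later on.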
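Further, regional features may be extracted for each community in the training data. The regional features extracted in the present disclosure may include at least one of: POI (point of interest) features of preset types in the surroundings, demographic features, and user travel features. Unlike existing infectious disease models, these regional features are unrelated to confirmed cases, so community risk can also be predicted in cities without an outbreak, for which there is no prior experience. These features are described in detail below.
POI features of preset types in the surroundings:
The living facilities around a community are usually related to the probability that the community is affected by the epidemic. For example, a community lacking basic living facilities may face high risk, because residents may travel farther to obtain daily necessities and may thus be infected on the way. Moreover, communities lacking basic living facilities are usually poorly managed, which likewise leads to a high risk of infection.
Based on the above considerations, the POI features of preset types around a community may include, but are not limited to, the following two:
First: distance information between the community and the nearest POI of each preset type. In the present disclosure, one or more types of POI may be preset, such as hospitals, clinics, schools, preschool education institutions, bus stops, subway stations, airports, railway stations, long-distance bus stations, shopping malls, supermarkets, markets, shops, public security bureaus, and scenic spots. The feature may be characterized by the distance from the community to the nearest hospital, to the nearest clinic, to the nearest school, and so on.
Second: the completeness of living facilities within a preset distance of the community. In the present disclosure, for example, the completeness of living facilities within 1 kilometer may be used as one of the features; that is, it can be measured by the presence within 1 kilometer of facilities such as hospitals, bus stops, supermarkets, shopping malls, and markets. For example, 1 indicates the highest completeness and 0 the lowest.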
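The two POI features above can be illustrated with a short sketch. The disclosure does not say how distance or completeness is computed, so the following assumes great-circle (haversine) distance and defines completeness as the fraction of preset POI types found within the radius; the function names and the sample coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def poi_features(community, pois, poi_types, radius_km=1.0):
    """Return (nearest distance per POI type, completeness score in [0, 1]).

    community: (lat, lon); pois: list of (type, lat, lon).
    Assumption: completeness = share of preset types with a POI within radius_km.
    """
    nearest = {}
    for t in poi_types:
        dists = [haversine_km(community[0], community[1], lat, lon)
                 for (pt, lat, lon) in pois if pt == t]
        nearest[t] = min(dists) if dists else float("inf")
    covered = sum(1 for t in poi_types if nearest[t] <= radius_km)
    return nearest, covered / len(poi_types)


pois = [("hospital", 39.905, 116.400), ("supermarket", 39.950, 116.400)]
nearest, completeness = poi_features(
    (39.900, 116.400), pois, ["hospital", "supermarket", "bus_stop"])
```

In this example only the hospital lies within 1 km, so one of the three preset types is covered.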
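Demographic features:
Since an epidemic usually spreads from person to person, population density needs to be considered in risk prediction. Communities with higher population density usually have a higher infection risk than those with lower density. Population density can therefore be used as one of the demographic features.
In addition, different commuting distances also have some influence on epidemic risk, so the commuting distance distribution of a community can be used as one of the demographic features. As one implementation, it may be characterized by the community's average commuting distance. The commuting distance may refer to the distance traveled to and from work, to and from school, and so on.
Different groups usually also have different probabilities of being infected during an epidemic. For example, the elderly and the young are usually more susceptible to infection because of weaker resistance. As another example, people with a higher level of education understand and guard against risks better, so their probability of infection is relatively lower. Based on such considerations, at least one of age distribution, gender distribution, income distribution, consumption capacity distribution, education level distribution, marital status distribution, life stage distribution, occupation type distribution, and industry type distribution may be selected as a demographic feature.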
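As a rough illustration of how such demographic features might be turned into numbers, the sketch below computes population density, average commuting distance, and a normalized age distribution. The age buckets and the function name are assumptions made for this example only; the disclosure does not specify them.

```python
def demographic_features(population, area_km2, ages, commute_km):
    """Return [density, mean commute distance, age-bucket shares...]."""
    density = population / area_km2                 # people per square kilometre
    mean_commute = sum(commute_km) / len(commute_km)
    bins = [(0, 18), (18, 60), (60, 200)]           # hypothetical age buckets
    counts = [sum(1 for a in ages if lo <= a < hi) for lo, hi in bins]
    age_dist = [c / len(ages) for c in counts]      # shares sum to 1
    return [density, mean_commute] + age_dist


features = demographic_features(
    population=2000, area_km2=0.5, ages=[10, 30, 40, 70], commute_km=[2.0, 4.0])
```

Other distributions listed above (income, education level, and so on) could be appended in the same way as normalized bucket shares.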
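User travel features:
Some related studies have shown that user travel behavior is usually closely related to the spread of an epidemic. The user travel features involved in the present disclosure may include, but are not limited to, at least one of the following:
First: travel mode. For example, travel modes such as walking, cycling, public transport, and private car may be predefined.
Second: origin-destination pattern distribution. This may include information such as the type of destination and the distance between origin and destination. Destinations may be divided in advance into types such as hospital, restaurant, hotel, and school, and distance buckets may be predefined, for example within 3 km, 3 km to 10 km, 10 km to 20 km, and so on; the distance from origin to destination is mapped to the corresponding distance bucket and used as a feature.
Third: origin-travel mode-destination pattern distribution. Here the origin is the community itself, and the travel modes and destination types may be predefined; among the combinations of travel mode and destination type counted for the community, the top N combinations are used as features. N is a preset positive integer, for example 20.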
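The distance bucketing and the top-N combination counting described above can be sketched as follows. The three bucket boundaries come from the text; the trip tuple layout, the function names, and the sample trips are hypothetical.

```python
from collections import Counter

# Distance buckets in km taken from the text; further buckets may exist.
DISTANCE_BUCKETS = [(0.0, 3.0), (3.0, 10.0), (10.0, 20.0)]


def distance_bucket(distance_km):
    """Index of the bucket containing distance_km; len(DISTANCE_BUCKETS) if none matches."""
    for i, (lo, hi) in enumerate(DISTANCE_BUCKETS):
        if lo <= distance_km < hi:
            return i
    return len(DISTANCE_BUCKETS)


def od_distribution(trips):
    """Share of trips per (destination type, distance bucket); trips are (mode, dest, km)."""
    counts = Counter((dest, distance_bucket(d)) for _, dest, d in trips)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def top_n_mode_destination(trips, n=20):
    """The n most frequent (travel mode, destination type) combinations with counts."""
    counts = Counter((mode, dest) for mode, dest, _ in trips)
    return counts.most_common(n)


trips = [("walk", "school", 1.2), ("walk", "school", 2.0),
         ("car", "hospital", 12.5), ("bus", "restaurant", 4.0)]
od = od_distribution(trips)
top = top_n_mode_destination(trips, n=2)
```

Because every trip starts from the community itself, the origin does not need to appear in the keys.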
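It can be seen that the above features are relatively easy to obtain under any circumstances and can reflect, at the fine granularity of a community, the socio-economic conditions and spatial interaction characteristics of a region, thereby enabling fine-grained identification of high-risk regions and reducing social costs.
Step 102 above is described in detail below. The structure of the initial model used in training is described first. As shown in FIG. 2, the initial model may include an encoding network (Encoder), a discriminant network (Discriminator), and a classification network (Classifier), and may further include a decoding network (Decoder).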
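The regional features extracted from a sample community serve as the input of the encoding network. Since sample communities belonging to cities of different risk levels are used in actual training, this embodiment takes sample communities of a high-risk city and of a low-risk city as an example. Let [Figure PCTCN2021097958-appb-000001], [Figure PCTCN2021097958-appb-000002], and [Figure PCTCN2021097958-appb-000003] denote, respectively, the POI features of preset types in the surroundings, the demographic features, and the user travel features of a sample community of a high-risk city, and let [Figure PCTCN2021097958-appb-000004] and [Figure PCTCN2021097958-appb-000005] denote, respectively, the POI features of preset types in the surroundings, the demographic features, and the user travel features of a sample community of a low-risk city. After [Figure PCTCN2021097958-appb-000006] and [Figure PCTCN2021097958-appb-000007] are fused, for example by concatenation, the feature [Figure PCTCN2021097958-appb-000008] of the sample community of the high-risk city is obtained. After [Figure PCTCN2021097958-appb-000009] and [Figure PCTCN2021097958-appb-000010] are fused, for example by concatenation, the feature n_L of the sample community of the low-risk city is obtained.
n_E, as the input of the encoding network, is encoded by the encoding network to obtain the feature representation [Figure PCTCN2021097958-appb-000011] of the sample community of the high-risk city. Likewise, n_L, as the input of the encoding network, is encoded by the encoding network to obtain the feature representation [Figure PCTCN2021097958-appb-000012] of the sample community of the low-risk city.
The encoding network can be regarded as transforming the input feature vector to obtain a new probability distribution.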
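The fusion-then-encoding step can be sketched in a few lines. The disclosure does not specify the encoder architecture, so the single tanh layer below, the class name `LinearEncoder`, the dimensions, and the sample values are all assumptions chosen purely for illustration.

```python
import math
import random


def fuse(poi, demo, travel):
    """Fuse the three feature groups by concatenation."""
    return list(poi) + list(demo) + list(travel)


class LinearEncoder:
    """Hypothetical one-layer encoder: z = tanh(W n + b)."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(out_dim)]
        self.b = [0.0] * out_dim

    def __call__(self, n):
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, n)) + b)
                for row, b in zip(self.w, self.b)]


n_e = fuse(poi=[0.5, 1.0], demo=[0.3], travel=[0.2, 0.8])  # fused community feature
encoder = LinearEncoder(in_dim=len(n_e), out_dim=3)
z_e = encoder(n_e)                                         # feature representation
```

The same encoder instance is applied to communities from high-risk and low-risk cities alike, which is what allows the later adversarial objective to push their representations toward a shared space.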
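Generally speaking, if experience is to be learned from cities where a large-scale outbreak has already occurred (that is, high-risk cities), that experience usually needs to be common across different cities rather than unique to a particular city. How to learn such common features is a very important issue in model training. In the present disclosure, this problem is solved by training the discriminant network.
The function of the discriminant network is to determine, from the input [Figure PCTCN2021097958-appb-000013], the risk level of the city from which that feature representation originates, and to determine, from the input [Figure PCTCN2021097958-appb-000014], the risk level of the city from which that feature representation originates. An important training objective is that, after encoding by the encoding network, the resulting feature representations should as far as possible prevent the discriminant network from distinguishing the city they come from, that is, to minimize the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels. This enables the encoding network to learn features common to cities. A loss function, called the second loss function L_2, can be constructed from this training objective, for example:
Figure PCTCN2021097958-appb-000015
where D() denotes the identification result of the discriminant network.
Further, in addition to learning features common to cities, the discriminant network still needs to maintain its own function, namely identifying the risk level of the source city. Adversarial learning can therefore be used, with another loss function constructed, called the first loss function L_1, which is used to train the discriminant network so as to minimize the difference between the discriminant network's identification result for a sample region and the annotation result. For example:
Figure PCTCN2021097958-appb-000016
During adversarial learning, under the influence of L_1 the discriminant network keeps learning how to distinguish the risk levels of the cities from which [Figure PCTCN2021097958-appb-000017] and [Figure PCTCN2021097958-appb-000018] originate, which causes L_2 to rise. In turn, under the influence of L_2 the encoding network learns common features as far as possible so as to lower L_2. The encoding network and the discriminant network thus continually compete during learning and eventually reach equilibrium. At that point the discriminant network cannot distinguish sample communities of high-risk cities from those of low-risk cities, and the encoding network has learned the features common to sample communities of high-risk and low-risk cities.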
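The exact formulas for L_1 and L_2 are given only as figures in the source and are not reproduced here. As one plausible instantiation that matches the behavior described above (L_2 rises as the discriminant network gets better), the sketch below uses binary cross-entropy for L_1 and its negative for L_2. This choice, and the function names, are assumptions rather than the disclosure's own definitions.

```python
import math


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def discriminator_loss(d_high, d_low):
    """L_1 sketch: d_high / d_low are D's 'high-risk city' probabilities for
    representations of communities from high-risk / low-risk cities."""
    losses = [bce(p, 1) for p in d_high] + [bce(p, 0) for p in d_low]
    return sum(losses) / len(losses)


def confusion_loss(d_high, d_low):
    """L_2 sketch: the negative of L_1, so it is lowest when D cannot tell sources apart."""
    return -discriminator_loss(d_high, d_low)


sharp = confusion_loss([0.9], [0.1])     # D separates the two sources well
confused = confusion_loss([0.5], [0.5])  # D is at chance level
```

With these definitions a confident, correct discriminator yields a higher L_2 than a confused one, so minimizing L_2 with respect to the encoder pushes the representations toward being indistinguishable by source city.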
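Although the above learning approach can learn the features common to sample communities of high-risk and low-risk cities, it cannot learn the features of the sample communities that guide identification of a community's risk level. In the initial model, this identification is therefore performed by the classification network.
The classification network identifies the risk level of the corresponding sample community from [Figure PCTCN2021097958-appb-000019]. The training objective is to minimize the difference between the classification network's identification result for a sample region and the annotation result. A loss function, the third loss function L_3, can be constructed accordingly, for example:
Figure PCTCN2021097958-appb-000020
where y_E denotes the annotation result and [Figure PCTCN2021097958-appb-000021] denotes the classification network's identification result for [Figure PCTCN2021097958-appb-000022].
This loss function is used to optimize the encoding network and the classification network, so that the encoding network, on the basis of having learned features common across cities, further learns features that can guide identification of community risk levels, and so that the classification network learns the ability to identify community risk levels. Note also that the classification network is described above using binary classification as an example, but a multi-class classification network may also be used in an actual model.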
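Since the formula for L_3 appears only as a figure, the following is a hedged sketch of one common choice, cross-entropy between the classifier's predicted probabilities and the annotated risk level. It is written for any number of classes, which also covers the multi-class case mentioned above; the function names and sample values are hypothetical.

```python
import math


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-probability assigned to the annotated class index."""
    return -math.log(max(probs[label], eps))


def classification_loss(pred_probs, labels):
    """L_3 sketch: mean cross-entropy over a batch of sample communities."""
    return sum(cross_entropy(p, y) for p, y in zip(pred_probs, labels)) / len(labels)


# Two communities, binary case: index 1 = high risk, index 0 = low risk.
l3 = classification_loss([[0.2, 0.8], [0.9, 0.1]], [1, 0])
```

The loss shrinks toward zero as the classifier assigns more probability to the annotated level of each community.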
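Further, in order for the encoding network to learn features that capture the characteristics of a community as far as possible, an encoder-decoder framework is added in the present disclosure for feature reconstruction.
The role of the decoding network is to reconstruct the regional features from the feature representation of the input sample community. That is, n_E is reconstructed to obtain the vector representation [Figure PCTCN2021097958-appb-000023], whose dimension is the same as that of n_E, and n_L is reconstructed to obtain the vector representation [Figure PCTCN2021097958-appb-000024], whose dimension is the same as that of n_L. The optimization goal of the decoding network is to recover the original vector representation, that is, to minimize the difference between the reconstructed regional features and the regional features extracted from the sample region. A fourth loss function L_4 can be constructed accordingly, for example:
Figure PCTCN2021097958-appb-000025
L_4 is used to optimize the encoding network and the decoding network, so that the feature representation learned by the encoding network still has the ability to describe the characteristics of a community.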
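The formula for L_4 is likewise available only as a figure. One common way to measure the difference between the reconstructed and original regional features is mean squared error, sketched below as an assumption; the function name is hypothetical.

```python
def reconstruction_loss(original, reconstructed):
    """L_4 sketch: mean squared error between a feature vector and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("reconstruction must have the same dimension as the input")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


l4 = reconstruction_loss([1.0, 2.0], [1.0, 4.0])
```

The dimension check mirrors the requirement in the text that the reconstructed vector has the same dimension as the original feature vector.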
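In summary, as a preferred implementation, the four loss functions above are used to optimize and update the model parameters during training of the initial model. Specifically, in each iteration, L_1 is used to optimize and update the parameters of the discriminant network; L_2, L_3, and L_4 are used to optimize and update the parameters of the encoding network; and L_3 and L_4 are used to optimize and update the parameters of the classification network and the decoding network, respectively.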
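The assignment of losses to networks in each iteration can be written down as a small routing table. The routing itself follows the text; how the losses for the encoding network are combined is not stated, so the equal default weights and the names `LOSS_ROUTING` and `network_objectives` are assumptions of this sketch.

```python
# Which losses drive each network's parameter update, as described in the text.
LOSS_ROUTING = {
    "discriminator": ["L1"],
    "encoder": ["L2", "L3", "L4"],
    "classifier": ["L3"],
    "decoder": ["L4"],
}


def network_objectives(losses, weights=None):
    """Combine the current loss values into one objective per network.

    losses: dict like {"L1": ..., "L2": ..., "L3": ..., "L4": ...}.
    weights: optional per-loss weights; equal weights are assumed by default.
    """
    weights = weights or {name: 1.0 for name in losses}
    return {net: sum(weights[l] * losses[l] for l in names)
            for net, names in LOSS_ROUTING.items()}


objectives = network_objectives({"L1": 0.7, "L2": -0.7, "L3": 0.2, "L4": 0.1})
```

In a framework with automatic differentiation, each objective would then be minimized with respect to the parameters of its own network only, which is what makes the discriminant and encoding networks compete.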
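After training of the initial model ends, for example when the model converges or a preset number of iterations is reached, the risk prediction model is obtained from the trained encoding network and classification network. That is, although the discriminant network and the decoding network assist training, the resulting risk prediction model uses only the encoding network and the classification network, as shown in FIG. 3.
FIG. 4 illustrates a regional risk prediction method provided by an embodiment of the present disclosure; the method is implemented on the basis of the risk prediction model established above. As shown in FIG. 4, the method includes:
In 401, regional features of a target community are acquired.
The regional features extracted in this step are the same as those used in training the risk prediction model. They may likewise include at least one of: POI features of preset types in the surroundings, demographic features, and user travel features. For the specific content of the regional features, see the relevant description of the embodiment shown in FIG. 1; details are not repeated here.
In 402, the regional features are input into the risk prediction model, and the risk level of the target region is determined according to the output of the risk prediction model.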
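As shown in FIG. 3, let [Figure PCTCN2021097958-appb-000026] and [Figure PCTCN2021097958-appb-000027] denote, respectively, the POI features of preset types in the surroundings, the demographic features, and the user travel features of the target community. After [Figure PCTCN2021097958-appb-000028] and [Figure PCTCN2021097958-appb-000029] are fused, for example by concatenation, the feature n_T of the target community is obtained.
n_T, as the input of the encoding network, is encoded by the encoding network to obtain the feature representation [Figure PCTCN2021097958-appb-000030] of the target community.
The classification network identifies the risk level of the target community from [Figure PCTCN2021097958-appb-000031].
It can be seen that the above prediction of the risk level of the target region does not require information about the area to which the target region belongs; it is an area-independent risk level prediction.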
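The prediction flow above (fuse, encode, classify) can be sketched as a single function. The encoder and classifier here are stand-in callables rather than trained networks, and the function name, the level names, and the argmax decision rule are assumptions of this example.

```python
def predict_risk(poi, demo, travel, encoder, classifier, levels=("low", "high")):
    """Fuse the target community's features, encode them, and classify the risk level."""
    n_t = list(poi) + list(demo) + list(travel)  # fusion by concatenation
    z_t = encoder(n_t)                           # feature representation
    probs = classifier(z_t)                      # one probability per risk level
    best = max(range(len(probs)), key=lambda i: probs[i])
    return levels[best]


# Stand-ins for the trained networks (hypothetical, for illustration only).
level = predict_risk(
    poi=[0.1], demo=[0.9], travel=[0.5],
    encoder=lambda n: [sum(n)],
    classifier=lambda z: [0.3, 0.7],
)
```

Note that no city-level input appears anywhere in the call, which reflects the area-independent nature of the prediction.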
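As a typical application scenario, the present disclosure can be used to predict regional risk levels for the spread of an epidemic. The solution can identify potential high-risk regions even in areas where no large-scale outbreak has occurred, and thus provides important guidance for epidemic prevention and control.
The above is a detailed description of the method provided by the present disclosure; the apparatus provided by the present disclosure is described in detail below with reference to embodiments.
FIG. 5 is a structural diagram of the apparatus for establishing a risk prediction model provided by the present disclosure. The apparatus may be an application located on a server, or a functional unit such as a plug-in or software development kit (SDK) in an application located on a server, or may be located in a computer terminal with relatively strong computing capability; the embodiments of the present disclosure do not specifically limit this. As shown in FIG. 5, the apparatus 500 may include a data acquisition unit 501 and a model training unit 502, and may further include a feature extraction unit 503. The main functions of these units are as follows: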
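The data acquisition unit 501 is configured to acquire training data; the training data includes a sample region set and annotation results for the risk level of each sample region in the set and for the risk level of the area to which each sample region belongs.
The model training unit 502 is configured to train, with the training data, an initial model including an encoding network, a discriminant network, and a classification network, and after training, to obtain the risk prediction model from the encoding network and the classification network in the initial model.
The encoding network encodes the regional features of the sample regions to obtain a feature representation of each sample region; the discriminant network identifies, according to the feature representation of a sample region, the risk level of the area to which the sample region belongs; the classification network identifies, according to the feature representation of a sample region, the risk level of the sample region. The training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels, and minimizing the difference between the classification network's identification result for a sample region and the annotation result.
The feature extraction unit 503 is configured to acquire regional features of the sample regions, including at least one of: POI features of preset types in the surroundings, demographic features, and user travel features.
The POI features of preset types in the surroundings include at least one of: distance information between a sample region and the nearest POI of a preset type, and the completeness of living facilities within a preset distance of the sample region.
The demographic features include at least one of: population density, commuting distance distribution, age distribution, gender distribution, income distribution, consumption capacity distribution, education level distribution, marital status distribution, life stage distribution, occupation type distribution, and industry type distribution.
The user travel features include at least one of: travel mode, origin-destination pattern distribution, and origin-travel mode-destination pattern distribution.
As a preferred implementation, the initial model may further include a decoding network. The decoding network reconstructs the regional features from the feature representation of a sample region; the training objectives further include: minimizing the difference between the regional features reconstructed by the decoding network and the regional features extracted from the sample region.
As a preferred implementation, during training of the initial model, the model training unit 502 uses the first loss function to optimize the parameters of the discriminant network; uses the second, third, and fourth loss functions to optimize the parameters of the encoding network; uses the third loss function to optimize the parameters of the classification network; and uses the fourth loss function to optimize the parameters of the decoding network.
The first loss function is used to minimize the difference between the discriminant network's identification result for a sample region and the annotation result.
The second loss function is used to minimize the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels.
The third loss function is used to minimize the difference between the classification network's identification result for a sample region and the annotation result.
The fourth loss function is used to minimize the difference between the regional features reconstructed by the decoding network and the regional features extracted from the sample region.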
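FIG. 6 is a structural diagram of the regional risk prediction apparatus provided by the present disclosure. The apparatus may be an application located on a server, or a functional unit such as a plug-in or software development kit (SDK) in an application located on a server, or may be located in a computer terminal with relatively strong computing capability; the embodiments of the present disclosure do not specifically limit this. As shown in FIG. 6, the apparatus 600 may include a feature extraction unit 601 and a risk prediction unit 602. The main functions of these units are as follows:
The feature extraction unit 601 is configured to extract regional features of a target community.
The risk prediction unit 602 is configured to input the regional features into the risk prediction model and to determine the risk level of the target region according to the output of the risk prediction model.
The risk prediction model is established in advance by the apparatus shown in FIG. 5.
As a typical application scenario, the regional risk level predicted by the above regional risk prediction apparatus is the risk level of epidemic spread.
The embodiments in the present disclosure are described progressively; for parts that are the same or similar among embodiments, reference may be made to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiments are described relatively briefly because they are basically similar to the method embodiments; for relevant details, see the corresponding description of the method embodiments.
It should be noted here that the present disclosure is applicable to the typical application scenario of predicting the risk level of epidemic spread, but beyond this scenario it may also be reasonably extended to other scenarios within the scope of the idea of the present disclosure. When applied to other scenarios, the regional features extracted will differ accordingly.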
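According to embodiments of the present disclosure, the present disclosure further provides an electronic device, a readable storage medium, and a computer program product.
FIG. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile apparatuses, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present disclosure described and/or claimed herein.
As shown in FIG. 7, the device 700 includes a computing unit 701, which can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 702 or a computer program loaded from a storage unit 708 into a random access memory (RAM) 703. The RAM 703 may also store various programs and data required for the operation of the device 700. The computing unit 701, the ROM 702, and the RAM 703 are connected to one another via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Multiple components of the device 700 are connected to the I/O interface 705, including: an input unit 706, such as a keyboard or a mouse; an output unit 707, such as various types of displays and speakers; a storage unit 708, such as a magnetic disk or an optical disk; and a communication unit 709, such as a network card, a modem, or a wireless communication transceiver. The communication unit 709 allows the device 700 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
The computing unit 701 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 701 performs the methods and processing described above, such as the method for establishing a risk prediction model or the regional risk prediction method. For example, in some embodiments, the method for establishing a risk prediction model or the regional risk prediction method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708.
In some embodiments, part or all of the computer program may be loaded and/or installed on the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method for establishing a risk prediction model and the regional risk prediction method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured in any other suitable manner (for example, by means of firmware) to perform the method for establishing a risk prediction model or the regional risk prediction method.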
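Various implementations of the systems and techniques described herein may be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor; the programmable processor may be a special-purpose or general-purpose programmable processor, may receive data and instructions from a storage system, at least one input apparatus, and at least one output apparatus, and may transmit data and instructions to the storage system, the at least one input apparatus, and the at least one output apparatus.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display apparatus (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing apparatus (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of apparatuses may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer having a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
A computer system may include clients and servers. A client and a server are generally remote from each other and usually interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.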
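It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved; no limitation is imposed herein.
The above specific implementations do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made according to design requirements and other factors. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.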

Claims (17)

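  1. A method for establishing a risk prediction model, comprising:
    acquiring training data, the training data comprising a sample region set and annotation results for the risk level of each sample region in the sample region set and for the risk level of the area to which each sample region belongs;
    training, with the training data, an initial model comprising an encoding network, a discriminant network, and a classification network, and after training, obtaining the risk prediction model from the encoding network and the classification network in the initial model;
    wherein the encoding network encodes the regional features of the sample regions to obtain a feature representation of each sample region; the discriminant network identifies, according to the feature representation of a sample region, the risk level of the area to which the sample region belongs; the classification network identifies, according to the feature representation of a sample region, the risk level of the sample region; and the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels, and minimizing the difference between the classification network's identification result for a sample region and the annotation result.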
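  2. The method according to claim 1, wherein the regional features of the sample regions include at least one of:
    POI features of preset types in the surroundings, demographic features, and user travel features.
  3. The method according to claim 1, wherein the POI features of preset types in the surroundings include at least one of: distance information between a sample region and the nearest POI of a preset type, and the completeness of living facilities within a preset distance of the sample region;
    the demographic features include at least one of: population density, commuting distance distribution, age distribution, gender distribution, income distribution, consumption capacity distribution, education level distribution, marital status distribution, life stage distribution, occupation type distribution, and industry type distribution;
    the user travel features include at least one of: travel mode, origin-destination pattern distribution, and origin-travel mode-destination pattern distribution.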
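  4. The method according to claim 1, wherein the initial model further comprises a decoding network;
    the decoding network reconstructs the regional features according to the feature representation of a sample region;
    the training objectives further include: minimizing the difference between the regional features reconstructed by the decoding network and the regional features extracted from the sample region.
  5. The method according to claim 4, wherein, during training of the initial model, a first loss function is used to optimize the parameters of the discriminant network; a second loss function, a third loss function, and a fourth loss function are used to optimize the parameters of the encoding network; the third loss function is used to optimize the parameters of the classification network; and the fourth loss function is used to optimize the parameters of the decoding network;
    the first loss function is used to minimize the difference between the discriminant network's identification result for a sample region and the annotation result;
    the second loss function is used to minimize the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels;
    the third loss function is used to minimize the difference between the classification network's identification result for a sample region and the annotation result;
    the fourth loss function is used to minimize the difference between the regional features reconstructed by the decoding network and the regional features extracted from the sample region.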
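  6. A regional risk prediction method, comprising:
    extracting regional features of a target residential community;
    inputting the regional features into a risk prediction model, and determining the risk level of the target region according to the output of the risk prediction model;
    wherein the risk prediction model is established in advance by the method according to any one of claims 1 to 5.
  7. The method according to claim 6, wherein the risk level is a risk level of epidemic spread.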
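  8. An apparatus for establishing a risk prediction model, comprising:
    a data acquisition unit, configured to acquire training data, the training data comprising a sample region set and annotation results for the risk level of each sample region in the sample region set and for the risk level of the area to which each sample region belongs;
    a model training unit, configured to train, with the training data, an initial model comprising an encoding network, a discriminant network, and a classification network, and after training, to obtain the risk prediction model from the encoding network and the classification network in the initial model;
    wherein the encoding network encodes the regional features of the sample regions to obtain a feature representation of each sample region; the discriminant network identifies, according to the feature representation of a sample region, the risk level of the area to which the sample region belongs; the classification network identifies, according to the feature representation of a sample region, the risk level of the sample region; and the training objectives of the initial model include: minimizing the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels, and minimizing the difference between the classification network's identification result for a sample region and the annotation result.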
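  9. The apparatus according to claim 8, further comprising:
    a feature extraction unit, configured to acquire regional features of the sample regions, including at least one of: POI features of preset types in the surroundings, demographic features, and user travel features.
  10. The apparatus according to claim 8, wherein the POI features of preset types in the surroundings include at least one of: distance information between a sample region and the nearest POI of a preset type, and the completeness of living facilities within a preset distance of the sample region;
    the demographic features include at least one of: population density, commuting distance distribution, age distribution, gender distribution, income distribution, consumption capacity distribution, education level distribution, marital status distribution, life stage distribution, occupation type distribution, and industry type distribution;
    the user travel features include at least one of: travel mode, origin-destination pattern distribution, and origin-travel mode-destination pattern distribution.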
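  11. The apparatus according to claim 8, wherein the initial model further comprises a decoding network;
    the decoding network reconstructs the regional features according to the feature representation of a sample region;
    the training objectives further include: minimizing the difference between the regional features reconstructed by the decoding network and the regional features extracted from the sample region.
  12. The apparatus according to claim 11, wherein, during training of the initial model, the model training unit uses a first loss function to optimize the parameters of the discriminant network; uses a second loss function, a third loss function, and a fourth loss function to optimize the parameters of the encoding network; uses the third loss function to optimize the parameters of the classification network; and uses the fourth loss function to optimize the parameters of the decoding network;
    the first loss function is used to minimize the difference between the discriminant network's identification result for a sample region and the annotation result;
    the second loss function is used to minimize the difference in the discriminant network's identification of sample regions belonging to areas of different risk levels;
    the third loss function is used to minimize the difference between the classification network's identification result for a sample region and the annotation result;
    the fourth loss function is used to minimize the difference between the regional features reconstructed by the decoding network and the regional features extracted from the sample region.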
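  13. A regional risk prediction apparatus, comprising:
    a feature extraction unit, configured to extract regional features of a target residential community;
    a risk prediction unit, configured to input the regional features into a risk prediction model and to determine the risk level of the target region according to the output of the risk prediction model;
    wherein the risk prediction model is established in advance by the apparatus according to any one of claims 8 to 12.
  14. The apparatus according to claim 13, wherein the risk level is a risk level of epidemic spread.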
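  15. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-7.
  16. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1-7.
  17. A computer program product, comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.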
PCT/CN2021/097958 2020-12-21 2021-06-02 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus WO2022134480A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2021576944A JP2023510665A (ja) 2020-12-21 2021-06-02 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus
EP21820072.3A EP4040353B1 (en) 2020-12-21 2021-06-02 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus
KR1020217042672A KR20220093046A (ko) 2020-12-21 2021-06-02 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus
US17/620,820 US20220398465A1 (en) 2020-12-21 2021-06-02 Method and apparatus for establishing risk prediction model as well as regional risk prediction method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011515953.3A CN112508300B (zh) 2020-12-21 2020-12-21 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus
CN202011515953.3 2020-12-21

Publications (1)

Publication Number Publication Date
WO2022134480A1 true WO2022134480A1 (zh) 2022-06-30

Family

ID=74921829

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097958 WO2022134480A1 (zh) 2020-12-21 2021-06-02 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus

Country Status (6)

Country Link
US (1) US20220398465A1 (zh)
EP (1) EP4040353B1 (zh)
JP (1) JP2023510665A (zh)
KR (1) KR20220093046A (zh)
CN (1) CN112508300B (zh)
WO (1) WO2022134480A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115935265A (zh) * 2023-03-03 2023-04-07 支付宝(杭州)信息技术有限公司 Method for training risk identification model, risk identification method and corresponding apparatus

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
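CN112508300B (zh) * 2020-12-21 2023-04-18 北京百度网讯科技有限公司 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus
CN113744888B (zh) * 2021-09-02 2023-09-22 深圳万海思数字医疗有限公司 Regional epidemic trend prediction and early-warning method and system
CN113837588B (zh) * 2021-09-17 2023-12-29 北京百度网讯科技有限公司 Training method and apparatus for evaluation model, electronic device and storage medium
CN114372642B (zh) * 2022-03-21 2022-05-20 创意信息技术股份有限公司 Method for risk assessment of urban holiday tourist attractions
CN115983142B (zh) * 2023-03-21 2023-08-29 之江实验室 Method for constructing regional population evolution model based on deep generative adversarial network
CN116028964B (zh) * 2023-03-28 2023-05-23 中国标准化研究院 Information security risk management system
CN117421244B (zh) * 2023-11-17 2024-05-24 北京邮电大学 Multi-source cross-project software defect prediction method, apparatus and storage medium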

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170242972A1 (en) * 2016-02-19 2017-08-24 International Business Machines Corporation Method for proactive comprehensive geriatric risk screening
US20190130212A1 (en) * 2017-10-30 2019-05-02 Nec Laboratories America, Inc. Deep Network Embedding with Adversarial Regularization
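CN110993119A (zh) * 2020-03-04 2020-04-10 同盾控股有限公司 Epidemic prediction method and apparatus based on population migration, electronic device and medium
CN111128399A (zh) * 2020-03-30 2020-05-08 广州地理研究所 Epidemic risk level assessment method based on people-flow density
CN111768873A (zh) * 2020-06-03 2020-10-13 中国地质大学(武汉) COVID-19 real-time risk prediction method
CN112508300A (zh) * 2020-12-21 2021-03-16 北京百度网讯科技有限公司 Method for establishing risk prediction model, regional risk prediction method and corresponding apparatus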

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
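CN109902880A (zh) * 2019-03-13 2019-06-18 南京航空航天大学 Urban crowd flow prediction method based on Seq2Seq generative adversarial network
CN110458572B (zh) * 2019-07-08 2023-11-24 创新先进技术有限公司 Method for determining user risk and method for establishing target risk identification model
CN110674979A (zh) * 2019-09-11 2020-01-10 腾讯科技(深圳)有限公司 Training method for risk prediction model, prediction method and apparatus, medium and device
CN110689184A (zh) * 2019-09-21 2020-01-14 广东毓秀科技有限公司 Method for predicting rail transit passenger flow by deep learning
CN111523596B (zh) * 2020-04-23 2023-07-04 北京百度网讯科技有限公司 Target recognition model training method, apparatus, device and storage medium
CN111523597B (zh) * 2020-04-23 2023-08-25 北京百度网讯科技有限公司 Target recognition model training method, apparatus, device and storage medium
CN111626119B (zh) * 2020-04-23 2023-09-01 北京百度网讯科技有限公司 Target recognition model training method, apparatus, device and storage medium
CN111626490A (zh) * 2020-05-20 2020-09-04 南京航空航天大学 Multi-task urban spatio-temporal prediction method based on adversarial learning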

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4040353A4

Also Published As

Publication number Publication date
CN112508300A (zh) 2021-03-16
KR20220093046A (ko) 2022-07-05
EP4040353B1 (en) 2023-07-26
JP2023510665A (ja) 2023-03-15
CN112508300B (zh) 2023-04-18
EP4040353A1 (en) 2022-08-10
US20220398465A1 (en) 2022-12-15
EP4040353A4 (en) 2022-08-10

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021576944

Country of ref document: JP

Kind code of ref document: A

Ref document number: 2021820072

Country of ref document: EP

Effective date: 20211216

NENP Non-entry into the national phase

Ref country code: DE