WO2022127057A1 - Weather warning text processing method, related apparatus, and computer program product - Google Patents

Info

Publication number
WO2022127057A1
Authority
WO
WIPO (PCT)
Prior art keywords
warning
weather
meteorological
elements
information
Prior art date
Application number
PCT/CN2021/100525
Other languages
English (en)
French (fr)
Inventor
张亦鹏
秦铎浩
刘明浩
Original Assignee
北京百度网讯科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Priority to EP21827554.3A (patent EP4040329A4, en)
Priority to US17/646,665 (patent US20220121812A1, en)
Publication of WO2022127057A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Definitions

  • This application relates to the technical field of data processing, in particular to artificial intelligence fields such as natural language processing, cloud services, and computer vision, and more specifically to a weather warning text processing method, apparatus, electronic device, computer-readable storage medium, and computer program product.
  • All local meteorological bureaus are required to include at least six key elements when releasing weather warning information: "issuing unit", "release time", "warning category", "warning level", "warning time limit", and "warning area".
  • The embodiments of the present application provide a weather warning text processing method, apparatus, electronic device, computer-readable storage medium, and computer program product.
  • An embodiment of the present application proposes a weather warning text processing method, including: obtaining a weather warning text to be processed; extracting each actual weather warning element from the weather warning text to be processed by using a preset element matching template, where the element matching template is obtained by clustering the contexts of sample weather warning elements; and normalizing each actual weather warning element and combining the resulting normalized warning elements in a preset order to obtain a key weather warning text.
  • An embodiment of the present application proposes a weather warning text processing device, including: a weather warning text acquisition unit, configured to obtain a weather warning text to be processed; a weather warning element extraction unit, configured to extract each actual weather warning element from the weather warning text to be processed by using a preset element matching template, where the element matching template is obtained by clustering the contexts of sample weather warning elements; and a normalization processing and combination unit, configured to normalize each actual weather warning element and combine the resulting normalized warning elements in a preset order to obtain a key weather warning text.
  • An embodiment of the present application provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the weather warning text processing method described in any implementation manner of the first aspect.
  • An embodiment of the present application provides a non-transitory computer-readable storage medium storing computer instructions, where the computer instructions are used to enable a computer to implement the weather warning text processing method described in any implementation manner of the first aspect.
  • An embodiment of the present application provides a computer program product including a computer program which, when executed by a processor, implements the weather warning text processing method described in any implementation manner of the first aspect.
  • First, the weather warning text to be processed is obtained; then, each actual weather warning element is extracted from the weather warning text to be processed by using a preset element matching template, where the element matching template is obtained by clustering the contexts of sample weather warning elements; next, each actual weather warning element is normalized; finally, the resulting normalized warning elements are combined in a preset order to obtain the key weather warning text.
  • The element matching templates used in the embodiments of the present application are obtained in advance by clustering the element contexts extracted for the sample weather warning elements, and each type of weather warning element has its own element matching templates. Using the element context as the input data of the clustering makes the cluster centers as accurate as possible, and clustering improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from all kinds of weather warning texts to be processed.
  • FIG. 1 is an exemplary system architecture to which the present application may be applied;
  • FIG. 2 is a flowchart of a method for processing weather warning text provided by an embodiment of the present application;
  • FIG. 3 is a flowchart of a method for generating an element matching template in the weather warning text processing method provided by an embodiment of the present application;
  • FIG. 5 is a schematic flowchart of a method for processing weather warning text in an application scenario provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram of the detailed process of generating an element template in FIG. 5;
  • FIG. 7 is a schematic diagram of the detailed process of time normalization in FIG. 5;
  • FIG. 8 is a schematic diagram of the detailed process of place name normalization in FIG. 5;
  • FIG. 9 is a structural block diagram of a weather warning text processing device provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an electronic device suitable for executing a weather warning text processing method according to an embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the weather warning text processing method, apparatus, electronic device, computer-readable storage medium, and computer program product of the present application may be applied.
  • The system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.
  • The terminal devices 101, 102, 103 are used to send the weather warning text to be processed to the server 105 through the network 104; the network 104 is the communication link for data communication between the terminal devices 101, 102, 103 and the server 105; and the server 105 is used to generate key weather warning texts from the received weather warning texts to be processed.
  • The terminal devices 101, 102, 103 and the server 105 may be hardware or software.
  • When the terminal devices 101, 102, 103 are hardware, they can be various electronic devices, including smart phones, tablet computers, laptop computers, and desktop computers; when the terminal devices 101, 102, 103 are software, they can be one or more software or function modules installed in the electronic devices listed above, which is not specifically limited here.
  • When the server 105 is hardware, it can be implemented as a distributed server cluster composed of multiple servers or as a single server; when the server is software, it can likewise be implemented as one or more software or function modules, which is not specifically limited here.
  • The above purpose can be realized by applications installed on the terminal devices 101, 102, 103 and the server 105, such as a weather warning text processing application (which can be further divided into a client part and a server part). In addition, to keep the weather warning text processing application running reliably, other applications may also be installed on the terminal devices 101, 102, 103 and the server 105, such as fault diagnosis applications and communication applications for contacting management or operation and maintenance personnel.
  • The server 105 on which the application is installed can achieve the following effects when running the application: first, the weather warning text to be processed is obtained from the terminal devices 101, 102, and 103 through the network 104; then, each actual weather warning element is extracted from the weather warning text to be processed by using a preset element matching template, where the element matching template is obtained by clustering the contexts of sample weather warning elements; next, the actual weather warning elements are normalized; finally, the resulting normalized warning elements are combined in a preset order to obtain the key weather warning text.
  • the server 105 can also push the generated key weather warning text to the corresponding user, timely reminding the corresponding user to take preventive measures in advance.
  • The weather warning text to be processed can be obtained in real time from the terminal devices 101, 102, and 103 through the network 104, and can also be obtained from other websites that record the same or similar text information, for example by crawling the official websites of the national meteorological administration and local weather bureaus.
  • The previously acquired weather warning texts to be processed can also be pre-stored locally on the server 105 in various ways, so that when the server 105 detects that the data is already stored locally, it can perform the subsequent processing steps directly on the local data.
  • In that case, the exemplary system architecture 100 may omit the terminal devices 101, 102, 103 and the network 104.
  • The weather warning text processing method provided by the subsequent embodiments of the present application is generally executed by the server 105, which is capable of processing this type of data (that is, a device that stores important parameters such as the element matching templates, normalization rules, and combination order); accordingly, the weather warning text processing device is generally also provided in the server 105.
  • The weather warning text processing device may also be provided in the terminal devices 101, 102, and 103. In that case, the exemplary system architecture 100 may omit the server 105 and the network 104.
  • terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • FIG. 2 is a flowchart of a method for processing weather warning text provided by an embodiment of the present application, wherein the process 200 includes the following steps:
  • Step 201 Obtaining the weather warning text to be processed
  • the purpose of this step is to obtain the weather warning text to be processed by the execution body of the weather warning text processing method (for example, the server 105 shown in FIG. 1 ).
  • The weather warning text to be processed can be received in real time from terminal devices (for example, the terminal devices 101, 102, 103 shown in FIG. 1), and the terminal device can be a weather warning release device of a weather bureau or an information release interface.
  • It may not always be possible to obtain the weather warning text to be processed directly; sometimes only pictures or charts recording the corresponding information are available, from which the weather warning text is then obtained.
  • Step 202 Extract each actual meteorological warning element from the to-be-processed meteorological warning text by using a preset element matching template
  • this step is aimed at extracting various types of actual weather warning elements contained in the weather warning text to be processed by the above-mentioned execution subject using a preset element matching template.
  • This application requires at least six types of actual weather warning elements to be extracted, namely "issuer", "release time", "warning category", "warning level", "warning time limit", and "warning area", for example "Beijing Meteorological Bureau", "released at 12:15", "rainstorm warning", "orange", "will continue from 16:00 today to 19:00 today", and "covers most urban areas in Haidian and Dongcheng".
  • The element matching template is obtained by clustering the contexts of the sample weather warning elements, and the sample weather warning elements are extracted from sample weather warning information. Using context information means that the objects being clustered include not only the sample weather warning elements of each type but also the context surrounding them, so that the added context yields more accurate cluster centers and, in turn, more accurate element matching templates.
  • Step 203 Perform normalization processing on each actual meteorological warning element, and combine the obtained normalized early warning elements in a preset order to obtain key meteorological warning texts.
  • Following step 202, in this step the above-mentioned execution body normalizes the extracted actual weather warning elements so that they are expressed uniformly, then combines the uniformly expressed elements in a certain order, and finally obtains a key weather warning text that contains only key information and is uniformly expressed.
  • The normalization used for unified expression can adopt different methods for different types of weather warning elements.
  • For example, the same preset time representation format can be used for the "release time" and the "warning time limit", both of which contain time, and nonstandard names should be replaced with officially recognized names.
  • Since the "release time" is usually a moment and the "warning time limit" is usually a time period, a preset moment representation format and a preset time period representation format can also be used for them respectively, which is not specifically limited here.
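The combination step can be illustrated with a minimal Python sketch. It assumes the six element types have already been normalized; the key names, the `ELEMENT_ORDER` list, the separator, and the sample values are illustrative assumptions rather than anything the application specifies.

```python
# Hypothetical preset order for the six element types named in the text.
ELEMENT_ORDER = [
    "issuer", "release_time", "warning_category",
    "warning_level", "warning_time_limit", "warning_area",
]


def combine_elements(normalized, order=ELEMENT_ORDER, sep="; "):
    """Join already-normalized warning elements in the preset order."""
    missing = [name for name in order if name not in normalized]
    if missing:
        raise ValueError(f"missing warning elements: {missing}")
    return sep.join(normalized[name] for name in order)


# Input order does not matter; the preset order decides the output.
example = {
    "warning_level": "orange",
    "issuer": "Beijing Meteorological Bureau",
    "warning_area": "Haidian District, Dongcheng District",
    "release_time": "2020-10-26 12:15",
    "warning_category": "rainstorm",
    "warning_time_limit": "2020-10-26 16:00 to 2020-10-26 19:00",
}
key_text = combine_elements(example)
```

Raising an error on a missing element is one possible design choice; an implementation could equally fall back to a default value.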
  • The element matching templates used are obtained in advance by clustering the element contexts extracted for the sample weather warning elements, and each type of weather warning element has its own element matching templates. Using the element context as the input data of the clustering makes the cluster centers as accurate as possible, and clustering improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from all kinds of weather warning texts to be processed.
  • The key weather warning text can also be pushed in time to all users who may appear in the actual warning area.
  • the above-mentioned execution body can determine the information push area according to the warning area elements included in the key weather warning text, and then push the key weather warning text to users located in the information push area through a preset path.
  • The push range can also be refined to all users who may appear in the actual warning area within the actual warning time limit. For example, with the user's authorization, the user's preset itinerary or current travel plan can be read to determine whether the user will enter or leave the actual warning area within the actual warning time limit.
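As a rough illustration of determining push targets from the warning area element, the following Python sketch filters users by district. The `User` record, its `district` field, and the sample data are hypothetical; the application does not describe how user locations are stored or which push path is used.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    district: str  # hypothetical: district where the user is currently located


def select_push_targets(warning_districts, users):
    """Return the ids of users located in any district of the warning area."""
    area = set(warning_districts)
    return [u.user_id for u in users if u.district in area]


users = [
    User("u1", "Haidian District"),
    User("u2", "Chaoyang District"),
    User("u3", "Dongcheng District"),
]
targets = select_push_targets(["Haidian District", "Dongcheng District"], users)
```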
  • The present application also provides flowcharts of two different element matching template generation methods in FIG. 3 and FIG. 4, respectively.
  • The process 300 shown in FIG. 3 includes the following steps:
  • Step 301 Obtain sample data from an authoritative issuing agency of weather warning information
  • the purpose of this step is to pre-acquire the sample weather warning data for extracting the cluster center by the above-mentioned execution body.
  • the sample weather warning data are obtained from authoritative organizations that issue weather warning information, such as the State Meteorological Administration, local weather bureaus, and so on.
  • Step 302 Obtain location information of various types of weather warning elements included in the sample data
  • In this step, the above-mentioned execution body obtains the position information of the various types of weather warning elements in the corresponding sample data. The position information can be the start and end positions of each actual weather warning element in the sample data, or it can be an overlaid highlight mark, etc.
  • The position information of the weather warning elements in the sample data is usually marked by experienced technicians, so as to ensure as far as possible the accuracy of the element matching templates obtained through clustering.
  • Step 303 Extract the context information of the weather warning element of the corresponding type according to the location information
  • Following step 302, in this step the above-mentioned execution body extracts the context information of the corresponding type of weather warning element according to the position information; that is, starting from the position of the weather warning element, it extracts some of the text before and after it as context.
  • Step 304 Clustering the context information of the meteorological warning elements of the same type according to the preset number of cluster centers to obtain cluster centers;
  • Following step 303, in this step the above-mentioned execution body clusters the context information of weather warning elements of the same type according to the preset number of cluster centers to obtain the cluster centers.
  • The number of cluster centers can be set according to actual needs. When the sample data volume is large enough and the computing power is sufficient, the number of cluster centers can be set to a large value so as to obtain cluster centers with higher discrimination accuracy.
  • Step 305 Generate element matching templates of corresponding types of weather warning elements according to the obtained cluster centers.
  • this step is aimed at generating the element matching template of the corresponding type of weather warning elements according to the obtained cluster centers by the above-mentioned executing subject.
  • each type of weather warning element will correspond to multiple cluster centers, and according to each cluster center, an element matching template can be generated, and finally multiple element matching templates corresponding to each type of weather warning element can be obtained.
  • The process 400 shown in FIG. 4 includes the following steps:
  • Step 401 Obtain sample data from an authoritative issuing agency of weather warning information
  • Step 402 Obtain location information of various types of weather warning elements included in the sample data
  • Step 403 Extract the context information of the weather warning element of the corresponding type according to the location information
  • Step 404: For context information whose actual length is less than the preset length, pad it with preset characters until the preceding context, the following context, or both reach the preset length;
  • This embodiment also takes into account that, for some weather warning elements located at the beginning or end of the sample data, it may not be possible to extract preceding context, following context, or both of sufficient length. Context information whose actual length is less than the preset length is therefore padded with preset characters until the preceding context, the following context, or both reach the preset length.
  • For example, each weather warning element may be required to have 20 characters of preceding context and 20 characters of following context, for a total length of 40 characters.
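The padding described in step 404 can be sketched in Python as follows. The function name `extract_context`, the pad character `#`, and the choice to pad on the outer side of each context are assumptions made for illustration; the application only states that preset characters are used to reach the preset length.

```python
def extract_context(text, start, end, width=20, pad="#"):
    """Return `width` characters before text[start:end] followed by `width`
    characters after it, padding with `pad` where the text runs out."""
    if len(pad) != 1:
        raise ValueError("pad must be a single character")
    left = text[max(0, start - width):start]
    right = text[end:end + width]
    # Pad on the outer side so the characters adjacent to the element stay adjacent.
    return left.rjust(width, pad) + right.ljust(width, pad)


# Element "orange" sits at the very beginning, so the left context is all padding.
ctx = extract_context("orange warning", 0, 6, width=5)
```

With the default `width=20`, every context has a total length of 40 characters, matching the example in the text.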
  • Step 405: Convert the padded context information, whose length is now the preset length, into a context vector;
  • this step converts the context information in the form of text into a vector form that is more convenient for clustering processing, so as to improve the efficiency of subsequent processing.
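The application does not say how the text is vectorized. As one possible approach, the sketch below builds a character-count vector over a vocabulary collected from the padded contexts; `build_vocab` and `to_vector` are hypothetical helper names, and a real system might instead use learned embeddings.

```python
def build_vocab(contexts):
    """Assign each distinct character an index, in order of first appearance."""
    vocab = {}
    for ctx in contexts:
        for ch in ctx:
            vocab.setdefault(ch, len(vocab))
    return vocab


def to_vector(context, vocab):
    """Count how often each vocabulary character occurs in the context."""
    vec = [0] * len(vocab)
    for ch in context:
        if ch in vocab:
            vec[vocab[ch]] += 1
    return vec


contexts = ["ab#", "bcc"]
vocab = build_vocab(contexts)
vectors = [to_vector(c, vocab) for c in contexts]
```

Fixed-length numeric vectors of this kind can be passed directly to a standard clustering routine such as k-means.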
  • Step 406 Clustering the context vectors of the weather warning elements of the same type according to the preset number of cluster centers to obtain cluster centers;
  • Step 407 Generate a regular expression list of corresponding types of weather warning elements according to the obtained cluster centers.
  • a regular expression is specifically selected as the element matching template, and the regular expressions respectively generated by different cluster centers of the corresponding types of meteorological warning elements are recorded in the regular expression list.
  • The embodiment shown in FIG. 4 further addresses whether and how to unify the lengths of the context information of different weather warning elements, converts the context information into vector form, and finally selects regular expressions, which have a wider application range and are easier to edit, as the specific element matching templates.
  • the present application also considers how to utilize the generated key weather warning text more effectively, so as to maximize the value of the data.
  • Possible patterns can be mined at the statistical level from the correlations between the weather warning elements in the key weather warning texts, which contain only key information, and then used for prediction, or simply used for annual statistics.
  • One processing method, given as a non-limiting example, is: according to each key weather warning text issued within a preset time period, count the actual number of occurrences of a preset target weather warning element or combination of target weather warning elements, and then generate statistics and forecast results according to the actual number.
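The counting step can be sketched with `collections.Counter`. The dictionary representation of a key weather warning text and the field names are assumptions made for illustration; how forecast results are derived from the counts is left open, as in the text.

```python
from collections import Counter


def count_target_combinations(key_warnings, target_fields):
    """Count occurrences of each value combination of the target elements."""
    counts = Counter()
    for warning in key_warnings:
        counts[tuple(warning[field] for field in target_fields)] += 1
    return counts


# Hypothetical key warning texts issued within the preset time period.
key_warnings = [
    {"warning_category": "rainstorm", "warning_level": "orange"},
    {"warning_category": "rainstorm", "warning_level": "orange"},
    {"warning_category": "typhoon", "warning_level": "blue"},
]
counts = count_target_combinations(
    key_warnings, ["warning_category", "warning_level"]
)
```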
  • The main element extraction task performed by the weather warning information processing system can be divided into two parts: using element (matching) templates to locate each weather warning element in the weather warning text to be processed, and normalizing each located weather warning element. Normalization is mainly completed through rules, dictionaries, and text similarity calculation.
  • For each type of weather warning element, all regular expressions in the element (matching) template list of that type are tried in turn against the weather warning text under test. If any of them matches, element positioning is complete; if no template matches, element positioning fails.
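Trying the regular expressions of a template list in turn can be sketched as follows. The two patterns and the named group `value` are hypothetical examples for the "warning level" element; the templates actually generated from cluster centers are not given in the text.

```python
import re

# Hypothetical template list for the "warning level" element.
LEVEL_TEMPLATES = [
    r"issued an? (?P<value>\w+) (?:rainstorm|typhoon|blizzard) warning",
    r"warning level:\s*(?P<value>\w+)",
]


def locate_element(text, templates):
    """Try each template in order; return (value, span) for the first match,
    or None if no template matches (element positioning fails)."""
    for pattern in templates:
        match = re.search(pattern, text)
        if match:
            return match.group("value"), match.span("value")
    return None


text = "Beijing Meteorological Bureau issued an orange rainstorm warning at 12:15"
located = locate_element(text, LEVEL_TEMPLATES)
```

Returning the span as well as the value keeps the position available for later steps that substitute normalized text back by location.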
  • the time in the weather warning information includes the release time and the warning time limit (valid time interval):
  • The format of the warning release time is uniform: it is always a full date followed by a clock time, such as 14:25 on October 26, 2020. It is only necessary to remove the space characters in the release time string to complete the normalization of the warning release time. If the warning release time is not located, the current time is taken as the warning release time.
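A minimal Python sketch of this rule is shown below. It assumes the original release time string uses the Chinese date form (for example `2020年10月26日 14:25`), which is where removing spaces yields a clean result; that format, the function name, and the injectable `now` parameter are assumptions for illustration.

```python
from datetime import datetime


def normalize_release_time(raw, now=None):
    """Strip all whitespace from a located release time string; if none was
    located (raw is None), fall back to the current time in the same form."""
    if raw is None:
        now = now or datetime.now()
        return (f"{now.year}年{now.month:02d}月{now.day:02d}日"
                f"{now.hour:02d}:{now.minute:02d}")
    return "".join(raw.split())


normalized = normalize_release_time("2020年10月26日 14:25")
fallback = normalize_release_time(None, now=datetime(2020, 10, 26, 14, 25))
```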
  • the warning types (such as typhoon, rainstorm, blizzard) are accurate and unambiguous, and do not need to be normalized.
  • A weather warning level dictionary is constructed, with the warning level as the unit.
  • Each entry in the warning level dictionary maps a specific text to the standard expression of a warning level, such as "general warning" -> "blue warning", "IV warning" -> "blue warning", "4 warning" -> "blue warning", and "blue warning" -> "blue warning".
  • the warning level dictionary With the help of the warning level dictionary, the normalization of warning information levels can be quickly completed.
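The dictionary lookup can be sketched as a plain mapping. The entries are the four examples given in the text; the case-insensitive matching and the `None` result for unknown expressions are design assumptions.

```python
# Entries taken from the examples in the text, stored with lowercase keys.
_RAW_LEVELS = {
    "general warning": "blue warning",
    "IV warning": "blue warning",
    "4 warning": "blue warning",
    "blue warning": "blue warning",
}
LEVEL_DICT = {key.lower(): value for key, value in _RAW_LEVELS.items()}


def normalize_level(text):
    """Map a level expression to its standard form; None if it is unknown."""
    return LEVEL_DICT.get(text.strip().lower())


level = normalize_level("IV warning")
```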
  • The tree-like relationships, codes, and names of administrative divisions at and above the county level are crawled from the official website of the Ministry of Civil Affairs of China, and an administrative division attribution table is established that records, for each administrative division, its name, its level (province, city, county, etc.), and the codes of all its directly subordinate administrative divisions.
  • If the element text contains list separators such as commas or the word "and", the list of place names is extracted from the element text by splitting on the separators. Each place name is normalized by performing 5). The normalized place names are then substituted back into the element text at their original positions, and the resulting element text is used as the normalized result of the warning area.
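Splitting a warning area on separators, normalizing each place name, and substituting the results back in place can be sketched as follows. The `ALIASES` table is a hypothetical stand-in for the single-place-name normalization referred to as 5), and the separator pattern is likewise an assumption.

```python
import re

# Hypothetical stand-in for single place name normalization.
ALIASES = {"Haidian": "Haidian District", "Dongcheng": "Dongcheng District"}

# The capturing group makes re.split keep the separators in the result.
SEPARATORS = re.compile(r"(\s*(?:,|、|\band\b)\s*)")


def normalize_place(name):
    """Return the official name if an alias is known, else the name itself."""
    return ALIASES.get(name, name)


def normalize_area(element_text):
    """Normalize every place name while keeping the separators in position."""
    parts = SEPARATORS.split(element_text)
    # Even indices hold place names, odd indices hold the separators.
    parts[::2] = [normalize_place(name) for name in parts[::2]]
    return "".join(parts)


area = normalize_area("Haidian, Dongcheng and Chaoyang District")
```

Keeping the separators in the split result is what allows the normalized names to be written back at their original positions.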
  • the normalized weather warning elements will be output, and these weather warning elements can be combined in a certain order subsequently, and the obtained key weather warning texts will be pushed to the corresponding users.
  • As shown in FIG. 9, the present application provides an embodiment of a weather warning text processing device, which corresponds to the method embodiment shown in FIG. 2.
  • The weather warning text processing device 900 in this embodiment may include: a weather warning text acquisition unit 901, a weather warning element extraction unit 902, and a normalization processing and combination unit 903.
  • The weather warning text acquisition unit 901 is configured to obtain the weather warning text to be processed; the weather warning element extraction unit 902 is configured to extract each actual weather warning element from the weather warning text to be processed by using a preset element matching template, where the element matching template is obtained by clustering the contexts of sample weather warning elements; and the normalization processing and combination unit 903 is configured to normalize each actual weather warning element and combine the resulting normalized warning elements in a preset order to obtain the key weather warning text.
  • For the specific processing of the weather warning text acquisition unit 901, the weather warning element extraction unit 902, and the normalization processing and combination unit 903 of the weather warning text processing device 900, and the technical effects they bring, reference may be made to the descriptions of steps 201-203 in the embodiment corresponding to FIG. 2, which are not repeated here.
  • the weather warning text processing apparatus 900 may further include:
  • the information push area determination unit is configured to determine the information push area according to the warning area elements contained in the key meteorological warning text;
  • the early warning information push unit is configured to push key weather warning texts to users located in the information push area through a preset path.
  • the weather warning text processing apparatus 900 may further include an element matching template generating unit, and the element matching template generating unit may include:
  • the sample data acquisition sub-unit is configured to acquire sample data from an authoritative issuing agency of meteorological warning information
  • the element location information acquisition subunit is configured to acquire the location information of various types of weather warning elements contained in the sample data
  • the context information extraction subunit is configured to extract the context information of the corresponding type of weather warning elements according to the location information
  • the clustering processing and element matching template generation subunit is configured to cluster the context information of weather warning elements of the same type according to the preset number of cluster centers, and to generate element matching templates for the corresponding type of weather warning element according to the obtained cluster centers.
  • the element matching template generating unit may further include:
  • the filling and filling subunit is configured to use preset characters to fill in the above or below information of the corresponding weather warning element in response to the fact that the actual length of the above or below information is less than the preset length, until the above, The length after the complement of the following or the above and the following information is a preset length;
  • an expression conversion subunit configured to convert the context information whose length is a preset length after the complement into a context vector
  • the clustering processing and element matching template generation subunit includes a clustering processing module configured to perform clustering processing on the context information of the same type of meteorological warning elements according to the preset number of cluster centers, and the clustering processing module is further configured to :
  • the context vectors of the same type of meteorological warning elements are clustered according to the preset number of cluster centers.
  • the clustering processing and element matching template generating subunit includes an element matching template generating module configured to generate element matching templates of corresponding types of weather warning elements according to the obtained cluster centers
  • the element matching template generation module may be further configured to:
  • a regular expression list of corresponding types of weather warning elements is generated according to the obtained cluster centers; wherein, the regular expressions respectively generated by different cluster centers of the corresponding types of weather warning elements are recorded in the regular expression list.
  • the weather warning text processing apparatus 900 may further include:
  • the target element occurrence counting unit is configured to count, across the key weather warning texts issued within a preset time period, the actual number of occurrences of preset target weather warning elements or target weather warning element combinations;
  • the statistics and forecast result generating unit is configured to generate statistics and forecast results according to the actual quantity.
  • the element matching templates used by the weather warning text processing apparatus are obtained in advance by clustering the element contexts extracted from sample weather warning elements, and each type of weather warning element has its own corresponding element matching templates.
  • Using the element context as the input to clustering draws on the surrounding context to improve the accuracy of the cluster centers as far as possible.
  • The use of clustering in turn improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from weather warning texts of all formats.
  • the present application further provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 10 shows a schematic block diagram of an example electronic device 1000 that may be used to implement embodiments of the present application.
  • Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the application described and/or claimed herein.
  • the device 1000 includes a computing unit 1001 that can perform various appropriate actions and processing according to a computer program stored in a read only memory (ROM) 1002 or a computer program loaded from a storage unit 1008 into a random access memory (RAM) 1003. Various programs and data required for the operation of the device 1000 can also be stored in the RAM 1003.
  • the computing unit 1001, the ROM 1002, and the RAM 1003 are connected to each other through a bus 1004.
  • An input/output (I/O) interface 1005 is also connected to the bus 1004 .
  • Various components in the device 1000 are connected to the I/O interface 1005, including: an input unit 1006, such as a keyboard, mouse, etc.; an output unit 1007, such as various types of displays, speakers, etc.; a storage unit 1008, such as a magnetic disk, an optical disk, etc. ; and a communication unit 1009, such as a network card, a modem, a wireless communication transceiver, and the like.
  • the communication unit 1009 allows the device 1000 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • Computing unit 1001 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of computing unit 1001 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various specialized artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any suitable processor, controller, microcontroller, etc.
  • the computing unit 1001 executes the various methods and processes described above, such as the weather warning text processing method.
  • the weather warning text processing method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 1008 .
  • part or all of the computer program may be loaded and/or installed on device 1000 via ROM 1002 and/or communication unit 1009.
  • When the computer program is loaded into RAM 1003 and executed by computing unit 1001, one or more steps of the above-described weather warning text processing method can be performed.
  • Alternatively, in other embodiments, the computing unit 1001 may be configured to execute the weather warning text processing method by any other suitable means (e.g., by means of firmware).
  • Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor.
  • The processor, which may be a special-purpose or general-purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store the program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fibers, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).
  • the systems and techniques described herein may be implemented on a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or web browser through which a user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (eg, a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
  • a computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • the server can be a cloud server, also known as a cloud computing server or cloud host. It is a host product in the cloud computing service system, intended to overcome the high management difficulty and weak business scalability of traditional physical hosts and virtual private server (VPS) services.
  • the element matching templates used in the technical solution provided by this embodiment are obtained in advance by clustering the element contexts extracted from sample weather warning elements.
  • Each type of weather warning element has its own corresponding element matching templates. Using the element context as the input to clustering draws on the surrounding context to improve the accuracy of the cluster centers as far as possible, and the use of clustering improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from weather warning texts of all formats.
  • steps may be reordered, added or deleted using the various forms of flow shown above.
  • For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.

Abstract

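Embodiments of the present application disclose a weather warning text processing method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product, relating to artificial intelligence fields such as natural language processing, cloud services, and computer vision. One specific implementation of the method includes: acquiring a weather warning text to be processed; extracting each actual weather warning element from the text using preset element matching templates, the templates being obtained by clustering the contexts of sample weather warning elements; and normalizing each actual weather warning element and combining the resulting normalized warning elements in a preset order to obtain a key weather warning text. This implementation can improve the accuracy of weather warning element extraction and the ability to generalize across weather warning texts of various formats.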

Description

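Weather warning text processing method, related apparatus, and computer program product
This patent application claims priority to Chinese patent application No. 202011492994.5, filed on December 17, 2020 and entitled "Weather warning text processing method, related apparatus, and computer program product", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of data processing, specifically to artificial intelligence fields such as natural language processing, cloud services, and computer vision, and in particular to a weather warning text processing method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
Background
When issuing weather warning information, local meteorological bureaus are required to include at least six key weather warning elements: "issuing unit", "issuing time", "warning category", "warning level", "warning validity period", and "warning area".
However, the weather warning information produced by different meteorological bureaus is often still inconsistent in its wording and expression.
Summary
Embodiments of the present application provide a weather warning text processing method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
In a first aspect, an embodiment of the present application provides a weather warning text processing method, including: acquiring a weather warning text to be processed; extracting each actual weather warning element from the text to be processed using preset element matching templates, the element matching templates being obtained by clustering the contexts of sample weather warning elements; and normalizing each actual weather warning element and combining the resulting normalized warning elements in a preset order to obtain a key weather warning text.
In a second aspect, an embodiment of the present application provides a weather warning text processing apparatus, including: a weather warning text acquisition unit configured to acquire a weather warning text to be processed; a weather warning element extraction unit configured to extract each actual weather warning element from the text to be processed using preset element matching templates, the element matching templates being obtained by clustering the contexts of sample weather warning elements; and a normalization and combination unit configured to normalize each actual weather warning element and combine the resulting normalized warning elements in a preset order to obtain a key weather warning text.
In a third aspect, an embodiment of the present application provides an electronic device including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the weather warning text processing method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to implement the weather warning text processing method described in any implementation of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product including a computer program that, when executed by a processor, implements the weather warning text processing method described in any implementation of the first aspect.
In the weather warning text processing method and apparatus, electronic device, computer-readable storage medium, and computer program product provided by the embodiments of the present application, a weather warning text to be processed is first acquired; each actual weather warning element is then extracted from it using preset element matching templates, the templates being obtained by clustering the contexts of sample weather warning elements; each actual weather warning element is next normalized; and finally the resulting normalized warning elements are combined in a preset order to obtain a key weather warning text.
The element matching templates used in the embodiments of the present application are obtained in advance by clustering the element contexts extracted from sample weather warning elements, and each type of weather warning element has its own corresponding element matching templates. Using the element context as the input to clustering draws on the surrounding context to improve the accuracy of the cluster centers as far as possible, while the use of clustering improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from weather warning texts of all formats.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present application, nor to limit the scope of the present application. Other features of the present application will become readily understood from the following description.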
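Brief Description of the Drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 is an exemplary system architecture to which the present application may be applied;
FIG. 2 is a flowchart of a weather warning text processing method provided by an embodiment of the present application;
FIG. 3 is a flowchart of one element matching template generation method in the weather warning text processing method provided by an embodiment of the present application;
FIG. 4 is a flowchart of another element matching template generation method in the weather warning text processing method provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of the weather warning text processing method in an application scenario provided by an embodiment of the present application;
FIG. 6 is a detailed schematic flowchart of element template generation in FIG. 5;
FIG. 7 is a detailed schematic flowchart of time normalization in FIG. 5;
FIG. 8 is a detailed schematic flowchart of place name normalization in FIG. 5;
FIG. 9 is a structural block diagram of a weather warning text processing apparatus provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device suitable for executing the weather warning text processing method provided by an embodiment of the present application.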
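Detailed Description
Exemplary embodiments of the present application are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding; these should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness. It should be noted that, where no conflict arises, the embodiments of the present application and the features in them may be combined with one another.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the weather warning text processing method, apparatus, electronic device, computer-readable storage medium, and computer program product of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The terminal devices 101, 102, and 103 send weather warning texts to be processed to the server 105 through the network 104; the network 104 is the communication link for data communication between the terminal devices 101, 102, and 103 and the server 105; and the server 105 generates key weather warning texts from the received weather warning texts to be processed.
Specifically, the terminal devices 101, 102, and 103 and the server 105 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices including smartphones, tablet computers, laptop computers, and desktop computers; when they are software, they may be one or more software or functional modules installed in the electronic devices listed above, which is not specifically limited here. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server; when the server is software, it may likewise be implemented as one or more software or functional modules, which is not specifically limited here.
The above purposes can be achieved by applications installed on the terminal devices 101, 102, and 103 and the server 105, such as a weather warning text processing application (which may be further divided into a client part and a server part). In addition, to keep weather warning text processing running continuously and stably, other applications may also be installed on the terminal devices 101, 102, and 103 and the server 105, such as fault diagnosis applications and communication applications for communicating with management or operations personnel.
Taking a weather warning text processing application that provides a weather warning text processing service as an example, the server 105 on which the application is installed can achieve the following when running it: first, acquire the weather warning text to be processed from the terminal devices 101, 102, and 103 through the network 104; then, extract each actual weather warning element from the text using preset element matching templates, the templates being obtained by clustering the contexts of sample weather warning elements; next, normalize each actual weather warning element; and finally combine the resulting normalized warning elements in a preset order to obtain a key weather warning text.
Further, the server 105 may also push the generated key weather warning text to the relevant users, to remind them in time to take precautions in advance.
It should be pointed out that, besides being acquired in real time from the terminal devices 101, 102, and 103 through the network 104, the weather warning text to be processed may also be crawled from other websites that carry the same or similar text information, for example the official websites of the national and local meteorological bureaus. Besides real-time acquisition, previously acquired weather warning texts to be processed may also be stored locally on the server 105 in advance by various means, so that when the server 105 detects that such data is already stored locally it can choose to carry out the subsequent processing steps on the local data. In this case, the exemplary system architecture 100 may omit the terminal devices 101, 102, and 103 and the network 104.
The weather warning text processing method provided by the subsequent embodiments of the present application is generally executed by the server 105, which has the capability to process this type of data (that is, the device that stores important parameters such as the element matching templates, normalization rules, and combination order); accordingly, the weather warning text processing apparatus is generally also provided in the server 105. It should also be pointed out, however, that when certain terminal devices have sufficient processing capability and computing resources, those terminal devices may perform the operations otherwise assigned to the server 105 and output the same results as the server 105. Accordingly, the weather warning text processing apparatus may also be provided in the terminal devices 101, 102, and 103. In this case, the exemplary system architecture 100 may omit the server 105 and the network 104.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers as required by the implementation.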
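Referring to FIG. 2, FIG. 2 is a flowchart of a weather warning text processing method provided by an embodiment of the present application, in which flow 200 includes the following steps:
Step 201: acquire a weather warning text to be processed;
In this step, the execution body of the weather warning text processing method (for example, the server 105 shown in FIG. 1) acquires the weather warning text to be processed.
The weather warning text to be processed may be received in real time from a terminal device (for example, the terminal devices 101, 102, and 103 shown in FIG. 1). The terminal device may be the weather warning issuing equipment of a meteorological bureau or an information publishing interface.
In some cases the weather warning text to be processed may not be directly available, and only pictures, charts, or the like carrying the corresponding information can be obtained. In that case, techniques such as optical character recognition and structured information extraction can be used to extract the corresponding weather warning text to be processed.
Step 202: extract each actual weather warning element from the weather warning text to be processed using preset element matching templates;
Building on step 201, in this step the execution body uses the preset element matching templates to extract the actual weather warning elements of each type contained in the weather warning text to be processed. The present application requires extracting at least six types of actual weather warning elements: "issuing unit", "issuing time", "warning category", "warning level", "warning validity period", and "warning area", for example "北京气象局" (Beijing Meteorological Bureau), "于12时15分发布" (issued at 12:15), "暴雨预警" (rainstorm warning), "橙色" (orange), "将自今日16时持续至今日19时" (will last from 16:00 to 19:00 today), and "覆盖海淀、东城大部分城区" (covering most urban areas of Haidian and Dongcheng).
The element matching templates are obtained by clustering the contexts of sample weather warning elements, and the sample weather warning elements are extracted from sample weather warning information. "Context information" means that the objects being clustered contain not only the sample weather warning elements of each type but also the surrounding text of each sample element; the added context yields more accurate cluster centers and therefore more accurate element matching templates.
Step 203: normalize each actual weather warning element, and combine the resulting normalized warning elements in a preset order to obtain a key weather warning text.
Building on step 202, in this step the execution body unifies the wording of each extracted actual weather warning element through normalization, and then combines the uniformly worded elements in a given order, finally obtaining a key weather warning text that contains only key information in a uniform wording.
The normalization used to unify the wording can take different forms depending on the type of weather warning element. For example, "issuing time" and "warning validity period", which contain times, can share the same preset time format, while "issuing unit" and "warning area", which contain locations, can be replaced with officially recognized names. Considering that "issuing time" is usually a moment while "warning validity period" is usually a time span, a preset format for moments and a preset format for time spans can also be used respectively; this is not specifically limited here.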
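The final combination in step 203 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the field names, the order in `PRESET_ORDER`, and the separator are all assumptions, since the source only says that the normalized elements are combined "in a preset order".

```python
# Hypothetical field names and order; the source does not fix either.
PRESET_ORDER = ["issuer", "publish_time", "category", "level", "validity", "area"]


def build_key_text(elements, order=PRESET_ORDER, sep=" | "):
    """Join already-normalized elements in the preset order, skipping missing ones."""
    return sep.join(elements[name] for name in order if name in elements)


normalized = {
    "level": "橙色预警",
    "issuer": "北京市气象局",
    "category": "暴雨",
    "area": "海淀区、东城区",
}
print(build_key_text(normalized))
```

Because the order lives in one list, changing how the key text is laid out does not touch the extraction or normalization code.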
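In the weather warning text processing method provided by the embodiments of the present application, the element matching templates are obtained in advance by clustering the element contexts extracted from sample weather warning elements, and each type of weather warning element has its own corresponding element matching templates. Using the element context as the input to clustering draws on the surrounding context to improve the accuracy of the cluster centers as far as possible, while the use of clustering improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from weather warning texts of all formats.
Further, once the key weather warning text has been obtained, which is more concise than the original text, less likely to be ambiguous, and easier for users to scan for key information, it can be pushed promptly to all users who may be present in the actual warning area. Specifically, the execution body can determine an information push area from the warning area element contained in the key weather warning text, and then push the key weather warning text through a preset path to users located in that area. Furthermore, because weather warning information is usually issued before the warning validity period begins, the validity period can be used to narrow the push scope to all users who may be in the actual warning area during the actual validity period. For example, with the user's authorization, the user's preset travel plans or currently scheduled trips can be read to determine whether the user will enter or leave the actual warning area within the actual validity period.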
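The push-scope refinement described above amounts to an interval-overlap test per user. The sketch below assumes a hypothetical data shape (each user has an `id` and a list of planned `stays` as `(region, arrive, leave)` tuples); the source does not specify how travel plans are stored or how the push itself is delivered.

```python
from datetime import datetime


def select_recipients(users, area, start, end):
    """Return IDs of users with a planned stay in `area` overlapping [start, end)."""
    selected = []
    for user in users:
        for region, arrive, leave in user["stays"]:
            # Two intervals overlap when each one starts before the other ends.
            if region == area and arrive < end and leave > start:
                selected.append(user["id"])
                break
    return selected


users = [
    {"id": "u1", "stays": [("海淀区", datetime(2020, 9, 30, 18), datetime(2020, 9, 30, 23))]},
    {"id": "u2", "stays": [("海淀区", datetime(2020, 9, 29, 8), datetime(2020, 9, 29, 20))]},
    {"id": "u3", "stays": [("东城区", datetime(2020, 9, 30, 15), datetime(2020, 9, 30, 20))]},
]
print(select_recipients(users, "海淀区", datetime(2020, 9, 30, 16), datetime(2020, 9, 30, 19)))
```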
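On the basis of the above embodiment, to give a fuller understanding of how the element matching templates are obtained, the present application provides flowcharts of two different element matching template generation methods in FIG. 3 and FIG. 4.
The flow 300 shown in the flowchart of FIG. 3 includes the following steps:
Step 301: acquire sample data from authoritative issuers of weather warning information;
In this step, the execution body acquires in advance the sample weather warning data used to derive the cluster centers. To ensure the accuracy of the results, the sample weather warning data is acquired from authoritative issuers of weather warning information, such as the national meteorological administration and local meteorological bureaus.
Likewise, if sample data in text form cannot be obtained directly, a conversion appropriate to the acquired form (for example pictures, tables, or charts) can be applied so that all sample data ends up in a uniform text form.
Step 302: acquire the position information of the weather warning elements of each type contained in the sample data;
Building on step 301, in this step the execution body acquires the position information of the weather warning elements of each type in the corresponding sample data. The position information may be the start and end positions of each actual weather warning element in the sample data, or an overlaid highlight mark, and so on. The position information is usually annotated by experienced technicians, to safeguard as far as possible the accuracy of the element matching templates obtained through clustering.
Step 303: extract the context information of the weather warning elements of the corresponding type according to the position information;
Building on step 302, in this step the execution body extracts the context information of the weather warning elements of the corresponding type according to the position information, that is, starting from the position of each weather warning element, it additionally extracts some of the text before and after it.
Step 304: cluster the context information of weather warning elements of the same type according to a preset number of cluster centers to obtain cluster centers;
Building on step 303, in this step the execution body clusters the context information of weather warning elements of the same type according to the preset number of cluster centers to obtain the cluster centers. The number of cluster centers can be set according to actual needs. When the amount of sample data and the computing capacity are both sufficient, the number can be set to a relatively large value for the sake of accuracy, yielding cluster centers that discriminate more finely.
Step 305: generate element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers.
Building on step 304, in this step the execution body generates element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers.
That is, each type of weather warning element corresponds to multiple cluster centers, and one element matching template can be generated from each cluster center, finally giving multiple element matching templates for each type of weather warning element.
When the element matching templates are actually used to extract the corresponding weather warning elements from a weather warning text to be processed, the templates can be tried in parallel or serially; which to use can be chosen flexibly according to the actual situation.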
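The flow 400 shown in the flowchart of FIG. 4 includes the following steps:
Step 401: acquire sample data from authoritative issuers of weather warning information;
Step 402: acquire the position information of the weather warning elements of each type contained in the sample data;
Step 403: extract the context information of the weather warning elements of the corresponding type according to the position information;
Step 404: for context information whose actual length is less than a preset length, pad it with preset characters until the preceding context, the following context, or both reach the preset length;
Unlike flow 300 in FIG. 3, this embodiment also takes into account that some weather warning elements located at the beginning or end of the sample data may not yield preceding, following, or surrounding context of sufficient length. Context information whose actual length is less than the preset length is therefore padded with preset characters until the preceding context, the following context, or both reach the preset length. For example, each weather warning element may be required to have 20 characters of preceding context and 20 characters of following context, for a total length of 40 characters.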
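Steps 403 and 404 can be sketched together as a single function. The 20-character window comes from the example in the text; the padding character `□` and the choice to pad on the outer side of each context are assumptions made for illustration.

```python
PAD_CHAR = "□"   # hypothetical filler; the source only says "preset characters"
WINDOW = 20      # 20 characters before and after, as in the example above


def extract_context(text, start, end, window=WINDOW, pad=PAD_CHAR):
    """Return (left, right) contexts of text[start:end], each exactly `window` chars."""
    left = text[max(0, start - window):start]
    right = text[end:end + window]
    # Pad on the side facing away from the element so that the characters
    # adjacent to the element keep the same offsets in every sample.
    left = pad * (window - len(left)) + left
    right = right + pad * (window - len(right))
    return left, right


sample = "北京市气象局发布暴雨橙色预警"
start = sample.find("暴雨")
left, right = extract_context(sample, start, start + len("暴雨"))
print(left + right)
```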
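It should be understood that a uniform length helps eliminate, as far as possible, differences between subsequent processing results.
Step 405: convert the context information that has the preset length after padding into context vectors;
Building on step 404, this step converts the context information from text form into a vector form that is more convenient for clustering, to improve the efficiency of subsequent processing.
It should be understood that converting the form of expression should not lose any of the information content, and other forms convenient for clustering may be used instead.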
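One simple way to realize step 405 is to map every character to an integer ID, which is also what the application scenario later in the description does with its 40 "character IDs". The sketch below is an assumed realization: the vocabulary is built in order of first appearance, and `unknown_id` for unseen characters is a hypothetical convention.

```python
def build_vocab(contexts):
    """Assign each distinct character an integer ID in order of first appearance."""
    vocab = {}
    for ctx in contexts:
        for ch in ctx:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode(ctx, vocab, unknown_id=-1):
    """Turn a fixed-length context string into a list of integer character IDs."""
    return [vocab.get(ch, unknown_id) for ch in ctx]


vocab = build_vocab(["ab", "bc"])
print(encode("cab", vocab))
```

The mapping is reversible through the vocabulary, so no information content is lost by the conversion.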
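Step 406: cluster the context vectors of weather warning elements of the same type according to the preset number of cluster centers to obtain cluster centers;
Step 407: generate a regular expression list for the weather warning elements of the corresponding type according to the obtained cluster centers.
This step specifically uses regular expressions as the element matching templates. The regular expression list records the regular expressions generated from the different cluster centers of the weather warning elements of the corresponding type.
Compared with the embodiment shown in FIG. 3, the embodiment shown in FIG. 4 further considers whether the context information of different weather warning elements has a uniform length and how to unify it, converts the context information from text form to vector form to increase processing efficiency, and finally adopts regular expressions, which apply more widely and are easier to edit, as the concrete element matching templates.
On the basis of any of the above embodiments, the present application also considers how to make more effective use of the generated key weather warning texts so as to realize the value of the data as far as possible. For example, the correlations between the weather warning elements in key weather warning texts, which contain only key information, can be mined statistically for possible regularities and then used for prediction, or they can simply be used for annual statistics. One non-limiting approach is: across the key weather warning texts issued within a preset time period, count the actual number of occurrences of preset target weather warning elements or target weather warning element combinations, and then generate statistics and prediction results from that number.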
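The counting step can be sketched with `collections.Counter`. The record layout (a dict with `publish_time`, `category`, and `level`) is an assumption; the source only says that occurrences of target elements or element combinations are counted over a preset time period, and it does not describe how predictions are derived from the counts.

```python
from collections import Counter
from datetime import datetime


def count_targets(records, start, end, keys=("category", "level")):
    """Count element combinations among key texts published in [start, end)."""
    counter = Counter()
    for rec in records:
        if start <= rec["publish_time"] < end:
            counter[tuple(rec[k] for k in keys)] += 1
    return counter


records = [
    {"publish_time": datetime(2020, 7, 1, 8), "category": "暴雨", "level": "橙色预警"},
    {"publish_time": datetime(2020, 7, 2, 8), "category": "暴雨", "level": "橙色预警"},
    {"publish_time": datetime(2020, 7, 3, 8), "category": "台风", "level": "蓝色预警"},
    {"publish_time": datetime(2021, 1, 1, 0), "category": "暴雨", "level": "橙色预警"},
]
counts = count_targets(records, datetime(2020, 1, 1), datetime(2021, 1, 1))
print(counts.most_common())
```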
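To deepen understanding, the present application also provides a specific implementation in combination with a specific application scenario; see the overall schematic flowchart shown in FIG. 5:
As shown in FIG. 5, the element extraction task mainly performed by this weather warning information processing system can be split into two parts: locating each weather warning element in the weather warning text to be processed using element (matching) templates, and normalizing each located and extracted weather warning element. Normalization is mainly accomplished through rules, dictionaries, and text similarity calculation.
Each of the two parts is described in detail below:
1. Locating each weather warning element in the weather warning text to be processed
1.1 Element (matching) template generation
To obtain context templates for each element in weather warning information of different formats, the following preparation is required (see the flowchart shown in FIG. 6):
1) Use a web crawler to continuously crawl each day's provincial, municipal, and county-level weather warning information from the warning list page of China Weather Network (中国天气网), until most weather warning information formats are covered;
2) Manually annotate the collected weather warning texts, marking the positions of the six types of weather warning elements in the crawled sample weather warning texts;
3) For each type of weather warning element, obtain in batches its context in the sample weather warning texts (from 20 characters before the element to 20 characters after it);
4) Pad the contexts of the same type of weather warning element (using a default character to fill out the 20-character preceding context and the 20-character following context) and perform character-level one-hot encoding (converting each element context into 20+20=40 integer character IDs);
5) Take the 40 character IDs as the feature vector of each context and apply the k-means clustering algorithm to the contexts of each type of weather warning element, setting a relatively large number of cluster centers n, such as n=100;
6) After clustering, manually evaluate and filter the weather warning element contexts corresponding to all cluster centers, and use the filtered contexts as element (matching) templates;
7) Rewrite the (matching) templates of all types of weather warning elements as regular expressions, and store the (matching) template list (regular expression list) of each type of weather warning element in an element (matching) template file.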
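Step 5) above applies k-means to the character-ID vectors. The sketch below is a plain Lloyd's-algorithm implementation using only the standard library, shown on toy two-dimensional data rather than real 40-dimensional context vectors. The random initialization, the fixed iteration count, and the `nearest_sample` helper (which picks the real sample closest to a center so that a person can review its original context, as in step 6) are assumptions; the source does not say which k-means variant or library was used.

```python
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on equal-length numeric vectors."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, v in enumerate(vectors):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment


def nearest_sample(vectors, center):
    """Index of the sample closest to a center, for manual review of its context."""
    return min(
        range(len(vectors)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vectors[i], center)),
    )


toy = [[0, 0], [0, 1], [10, 10], [10, 11]]
centers, assignment = kmeans(toy, k=2)
print(sorted(centers), assignment)
```

In practice a library implementation would normally replace this loop; the point here is only to show what "cluster centers" means for the context vectors.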
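1.2 Performing element location
For each type of weather warning element, all regular expressions in the element (matching) template list for that type are applied in turn to the weather warning text under test. If a match succeeds, element location is complete; if no template matches, element location fails.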
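Trying the templates in sequence can be sketched as below. The two patterns in `LEVEL_TEMPLATES` are hypothetical stand-ins for the "warning level" element; real templates would be derived from the reviewed cluster-center contexts, which the source does not list. The named group `value` is likewise an assumed convention for marking where the element sits inside its context.

```python
import re

# Hypothetical templates for the "warning level" element.
LEVEL_TEMPLATES = [
    re.compile(r"发布\S+?(?P<value>[蓝黄橙红]色)预警"),
    re.compile(r"预警级别[::]\s*(?P<value>\S+)"),
]


def locate(text, templates):
    """Return the element captured by the first matching template, else None."""
    for pattern in templates:
        match = pattern.search(text)
        if match:
            return match.group("value")
    return None  # no template matched: element location failed


print(locate("北京市气象局发布暴雨橙色预警信号", LEVEL_TEMPLATES))
```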
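2. Normalizing each located and extracted weather warning element
2.1 Time normalization (see also the flowchart shown in FIG. 7)
The times in weather warning information include the issuing time and the warning validity period (valid time interval):
1) The issuing time has a uniform format, always xxxx年xx月xx日xx时xx分 (year, month, day, hour, minute), such as 2020年10月26日14时25分. Normalizing the issuing time only requires removing the space characters from the issuing time string. If no issuing time is located, the current time is taken as the issuing time;
2) The valid time interval has three forms of expression: "未来xx小时内" (within the next xx hours), "未来xx-xx小时" (in the next xx to xx hours), and "xx日xx时xx分至xx日xx时xx分" (from day/hour/minute to day/hour/minute). A regular expression written for each form can extract either a time interval relative to the issuing time or an incomplete absolute time. By then referring to the already normalized issuing time, the complete form of the valid time interval can be determined, such as 2020年9月30日22时30分 to 2020年10月1日2时30分.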
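The two rules above can be sketched with `re` and `datetime`. The regular expressions are written from the three forms quoted in the text; the month-rollover rule in `_fill` (a day number smaller than the reference day is taken to mean the following month) is an assumption, and the sketch does not handle day numbers that are invalid for the resolved month.

```python
import re
from datetime import datetime, timedelta

PUBLISH_RE = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日(\d{1,2})时(\d{1,2})分")
WITHIN_RE = re.compile(r"未来(\d+)小时内")
RANGE_RE = re.compile(r"未来(\d+)-(\d+)小时")
ABS_RE = re.compile(r"(\d{1,2})日(\d{1,2})时(\d{1,2})分至(\d{1,2})日(\d{1,2})时(\d{1,2})分")


def parse_publish_time(raw, now=None):
    """Strip whitespace and parse; fall back to the current time if nothing matches."""
    match = PUBLISH_RE.search(re.sub(r"\s+", "", raw or ""))
    if not match:
        return now or datetime.now()
    return datetime(*map(int, match.groups()))


def _fill(ref, day, hour, minute):
    """Complete a day/hour/minute stamp with the year and month of `ref`."""
    year, month = ref.year, ref.month
    if day < ref.day:  # assumed to refer to the following month
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return datetime(year, month, day, hour, minute)


def resolve_validity(raw, published):
    """Return (start, end) of the valid interval, or None if no form matches."""
    text = re.sub(r"\s+", "", raw)
    if m := WITHIN_RE.search(text):
        return published, published + timedelta(hours=int(m.group(1)))
    if m := RANGE_RE.search(text):
        lo, hi = map(int, m.groups())
        return published + timedelta(hours=lo), published + timedelta(hours=hi)
    if m := ABS_RE.search(text):
        d1, h1, mi1, d2, h2, mi2 = map(int, m.groups())
        start = _fill(published, d1, h1, mi1)
        return start, _fill(start, d2, h2, mi2)
    return None


def format_cn(dt):
    """Render a datetime in the unpadded xxxx年xx月xx日xx时xx分 style."""
    return f"{dt.year}年{dt.month}月{dt.day}日{dt.hour}时{dt.minute}分"


published = parse_publish_time("2020年 9月30日 21时00分")
start, end = resolve_validity("30日22时30分至1日2时30分", published)
print(format_cn(start), "至", format_cn(end))
```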
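2.2 Warning type normalization
In the weather warning information issued by meteorological stations, the warning type (such as typhoon, rainstorm, or snowstorm) is always precise and unambiguous, so no normalization is needed.
2.3 Warning level normalization
A weather warning level dictionary is constructed, organized by warning level, from the 《气象灾害预警信号发布与传播办法》 (Measures for the Issuance and Dissemination of Meteorological Disaster Warning Signals) and from the warning level aliases collected from the gathered weather warning information. Each entry in the dictionary maps a specific text to the standard wording of a warning level, such as "一般预警" -> "蓝色预警" (general warning -> blue warning), "IV级预警" -> "蓝色预警" (level IV warning -> blue warning), "4级预警" -> "蓝色预警" (level 4 warning -> blue warning), and "蓝色预警" -> "蓝色预警" (blue warning -> blue warning). With the warning level dictionary, normalization of the warning level can be completed quickly.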
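The warning level dictionary is a plain lookup table. The sketch below contains only the four entries quoted in the text; entries for the other levels would have to be collected in the same way and are deliberately not guessed here. Returning `None` for unknown aliases is an assumed convention.

```python
# Only the entries given in the text; other levels would be added the same way.
LEVEL_DICT = {
    "一般预警": "蓝色预警",
    "IV级预警": "蓝色预警",
    "4级预警": "蓝色预警",
    "蓝色预警": "蓝色预警",
}


def normalize_level(raw):
    """Map a located level string to its standard wording, or None if unknown."""
    return LEVEL_DICT.get(raw.strip())


print(normalize_level(" IV级预警 "))
```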
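2.4 Place name normalization
As preparation for place name normalization, the tree structure, codes, and names of administrative divisions at county level and above are crawled from the official website of the Ministry of Civil Affairs of China, and an administrative division affiliation table is built. Each administrative division record stores its name, its level (province, city, county, and so on), and the codes of all administrative divisions directly under it.
Place name normalization of the warning information proceeds in the following steps (see also the flowchart shown in FIG. 8):
1) Use a regular expression to extract the place name of the location of the meteorological department that issued the warning, denoted name_public.
2) Traverse the provincial and municipal administrative division records, denoting the current division name as name_i; compute the edit distance edit_diff between name_public and name_i, and from it the text similarity of the two names, sim_i=edit_diff/len(name_public). Take the administrative division with the highest text similarity sim_i as the administrative division of the issuing meteorological department.
3) Look up in the administrative division affiliation table the names of all administrative divisions directly under the issuing department's division, and use them as the candidate normalization results for the "warning area" element; use the name of the issuing department's own division as the default normalization result for the "warning area" element.
4) If locating the "warning area" element fails, take the location of the issuing meteorological department as the warning area.
5) If the located "warning area" element text contains no list separator characters such as "、", ",", "和", or "与", denote the element text as name_raw and traverse the candidate normalization results, denoting the current one as name_j; compute the edit distance edit_diff between name_raw and name_j, and from it the text similarity sim_j=edit_diff/len(name_raw). Take the name of the administrative division with the highest text similarity sim_j as the normalized warning area.
6) If the located "warning area" element text contains list separator characters such as "、", ",", "和", or "与", extract the list of place names from the element text according to the separators, and normalize each place name in the list by applying step 5). Substitute the normalized place names back into the element text at their original positions, and use the resulting element text as the normalized warning area.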
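Steps 4) to 6) can be sketched with a standard Levenshtein distance. One point needs care: the text defines the similarity as edit_diff/len(name) and then takes the highest value, but a larger edit distance means the strings are less alike. This sketch therefore assumes the intended measure is 1 - edit_diff/len(name), so that "highest" does select the closest candidate; that reading is an interpretation, not something the source states. Splitting on 和 and 与 is also naive, since those characters can occur inside place names.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(raw, candidate):
    # Assumption: 1 - distance/len, so that a higher score means more similar.
    return 1 - edit_distance(raw, candidate) / max(len(raw), 1)


def best_match(raw, candidates):
    """Candidate division name most similar to the raw place name."""
    return max(candidates, key=lambda c: similarity(raw, c))


# Capturing group keeps the separators so they can be put back in place.
SEPARATORS = re.compile(r"([、,,和与])")


def normalize_area(raw, candidates, default):
    """Normalize a located warning-area string; fall back to `default` if empty."""
    if not raw:
        return default  # step 4): element not located
    pieces = SEPARATORS.split(raw)
    # Even indices are place names, odd indices are the separators themselves.
    return "".join(
        piece if i % 2 or not piece else best_match(piece, candidates)
        for i, piece in enumerate(pieces)
    )


districts = ["海淀区", "东城区", "朝阳区"]
print(normalize_area("海淀、东城", districts, default="北京市"))
```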
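Further, the above operations finally output the normalized weather warning elements; these can subsequently be combined in a given order, and the resulting key weather warning text pushed to the relevant users.
With further reference to FIG. 9, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a weather warning text processing apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.
As shown in FIG. 9, the weather warning text processing apparatus 900 of this embodiment may include: a weather warning text acquisition unit 901, a weather warning element extraction unit 902, and a normalization and combination unit 903. The weather warning text acquisition unit 901 is configured to acquire a weather warning text to be processed; the weather warning element extraction unit 902 is configured to extract each actual weather warning element from the text to be processed using preset element matching templates, the element matching templates being obtained by clustering the contexts of sample weather warning elements; and the normalization and combination unit 903 is configured to normalize each actual weather warning element and combine the resulting normalized warning elements in a preset order to obtain a key weather warning text.
In this embodiment, for the specific processing of the weather warning text acquisition unit 901, the weather warning element extraction unit 902, and the normalization and combination unit 903 of the weather warning text processing apparatus 900, and the technical effects they bring, reference may be made to the descriptions of steps 201-203 in the embodiment corresponding to FIG. 2, which are not repeated here.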
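In some optional implementations of this embodiment, the weather warning text processing apparatus 900 may further include:
an information push area determination unit configured to determine an information push area according to the warning area element contained in the key weather warning text;
a warning information push unit configured to push the key weather warning text through a preset path to users located in the information push area.
In some optional implementations of this embodiment, the weather warning text processing apparatus 900 may further include an element matching template generation unit, which may include:
a sample data acquisition subunit configured to acquire sample data from authoritative issuers of weather warning information;
an element position information acquisition subunit configured to acquire the position information of the weather warning elements of each type contained in the sample data;
a context information extraction subunit configured to extract the context information of the weather warning elements of the corresponding type according to the position information;
a clustering and element matching template generation subunit configured to cluster the context information of weather warning elements of the same type according to a preset number of cluster centers, and to generate element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers.
In some optional implementations of this embodiment, the element matching template generation unit may further include:
a padding subunit configured to, in response to the actual length of the preceding or following context information being less than a preset length, pad the preceding or following context information of the corresponding weather warning element with preset characters until the preceding context, the following context, or both reach the preset length;
an expression conversion subunit configured to convert the context information that has the preset length after padding into context vectors; and
the clustering and element matching template generation subunit includes a clustering module configured to cluster the context information of weather warning elements of the same type according to the preset number of cluster centers, the clustering module being further configured to:
cluster the context vectors of weather warning elements of the same type according to the preset number of cluster centers.
In some optional implementations of this embodiment, the clustering and element matching template generation subunit includes an element matching template generation module configured to generate element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers, and the element matching template generation module may be further configured to:
generate a regular expression list for the weather warning elements of the corresponding type according to the obtained cluster centers, wherein the regular expression list records the regular expressions generated from the different cluster centers of the weather warning elements of the corresponding type.
In some optional implementations of this embodiment, the weather warning text processing apparatus 900 may further include:
a target element occurrence counting unit configured to count, across the key weather warning texts issued within a preset time period, the actual number of occurrences of preset target weather warning elements or target weather warning element combinations;
a statistics and prediction result generation unit configured to generate statistics and prediction results according to the actual number.
This embodiment is the apparatus embodiment corresponding to the above method embodiment. The element matching templates used by the weather warning text processing apparatus provided by this embodiment are obtained in advance by clustering the element contexts extracted from sample weather warning elements, and each type of weather warning element has its own corresponding element matching templates. Using the element context as the input to clustering draws on the surrounding context to improve the accuracy of the cluster centers as far as possible, while the use of clustering improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from weather warning texts of all formats.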
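According to embodiments of the present application, the present application further provides an electronic device, a readable storage medium, and a computer program product.
FIG. 10 shows a schematic block diagram of an example electronic device 1000 that may be used to implement embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present application described and/or claimed herein.
As shown in FIG. 10, the device 1000 includes a computing unit 1001 that can perform various appropriate actions and processing according to a computer program stored in a read only memory (ROM) 1002 or a computer program loaded from a storage unit 1008 into a random access memory (RAM) 1003. Various programs and data required for the operation of the device 1000 can also be stored in the RAM 1003. The computing unit 1001, the ROM 1002, and the RAM 1003 are connected to one another through a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.
Multiple components in the device 1000 are connected to the I/O interface 1005, including: an input unit 1006, such as a keyboard or mouse; an output unit 1007, such as various types of displays and speakers; a storage unit 1008, such as a magnetic disk or optical disk; and a communication unit 1009, such as a network card, modem, or wireless communication transceiver. The communication unit 1009 allows the device 1000 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
The computing unit 1001 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 1001 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various specialized artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any suitable processor, controller, or microcontroller. The computing unit 1001 performs the methods and processing described above, such as the weather warning text processing method. For example, in some embodiments, the weather warning text processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1008. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1000 via the ROM 1002 and/or the communication unit 1009. When the computer program is loaded into the RAM 1003 and executed by the computing unit 1001, one or more steps of the weather warning text processing method described above can be performed. Alternatively, in other embodiments, the computing unit 1001 may be configured to perform the weather warning text processing method by any other suitable means (for example, by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor and may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present application, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fibers, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and pointing device (for example, a mouse or trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user can be received in any form (including acoustic, voice, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer having a graphical user interface or web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.
A computer system may include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship to each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system intended to overcome the high management difficulty and weak business scalability of traditional physical hosts and virtual private server (VPS) services.
The element matching templates used in the technical solution provided by this embodiment are obtained in advance by clustering the element contexts extracted from sample weather warning elements, and each type of weather warning element has its own corresponding element matching templates. Using the element context as the input to clustering draws on the surrounding context to improve the accuracy of the cluster centers as far as possible, while the use of clustering improves the generalization of the element matching templates generated from the cluster centers, so that accurate weather warning elements can be extracted from weather warning texts of all formats.
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above specific implementations do not limit the scope of protection of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.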

Claims (15)

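  1. A weather warning text processing method, comprising:
    acquiring a weather warning text to be processed;
    extracting each actual weather warning element from the weather warning text to be processed using preset element matching templates, the element matching templates being obtained by clustering the contexts of sample weather warning elements;
    normalizing each of the actual weather warning elements, and combining the resulting normalized warning elements in a preset order to obtain a key weather warning text.
  2. The method according to claim 1, further comprising:
    determining an information push area according to the warning area element contained in the key weather warning text;
    pushing the key weather warning text through a preset path to users located in the information push area.
  3. The method according to claim 1, wherein the process of generating the element matching templates comprises:
    acquiring sample data from authoritative issuers of weather warning information;
    acquiring the position information of the weather warning elements of each type contained in the sample data;
    extracting the context information of the weather warning elements of the corresponding type according to the position information;
    clustering the context information of weather warning elements of the same type according to a preset number of cluster centers, and generating element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers.
  4. The method according to claim 3, further comprising:
    in response to the actual length of the preceding or following context information being less than a preset length, padding the preceding or following context information of the corresponding weather warning element with preset characters until the preceding context, the following context, or both reach the preset length;
    converting the context information that has the preset length after padding into context vectors; and
    wherein the clustering of the context information of weather warning elements of the same type according to the preset number of cluster centers comprises:
    clustering the context vectors of weather warning elements of the same type according to the preset number of cluster centers.
  5. The method according to claim 3, wherein the generating of element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers comprises:
    generating a regular expression list for the weather warning elements of the corresponding type according to the obtained cluster centers, wherein the regular expression list records the regular expressions generated from the different cluster centers of the weather warning elements of the corresponding type.
  6. The method according to any one of claims 1-5, further comprising:
    counting, across the key weather warning texts issued within a preset time period, the actual number of occurrences of preset target weather warning elements or target weather warning element combinations;
    generating statistics and prediction results according to the actual number.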
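  7. A weather warning text processing apparatus, comprising:
    a weather warning text acquisition unit configured to acquire a weather warning text to be processed;
    a weather warning element extraction unit configured to extract each actual weather warning element from the weather warning text to be processed using preset element matching templates, the element matching templates being obtained by clustering the contexts of sample weather warning elements;
    a normalization and combination unit configured to normalize each of the actual weather warning elements and combine the resulting normalized warning elements in a preset order to obtain a key weather warning text.
  8. The apparatus according to claim 7, further comprising:
    an information push area determination unit configured to determine an information push area according to the warning area element contained in the key weather warning text;
    a warning information push unit configured to push the key weather warning text through a preset path to users located in the information push area.
  9. The apparatus according to claim 7, further comprising an element matching template generation unit, the element matching template generation unit comprising:
    a sample data acquisition subunit configured to acquire sample data from authoritative issuers of weather warning information;
    an element position information acquisition subunit configured to acquire the position information of the weather warning elements of each type contained in the sample data;
    a context information extraction subunit configured to extract the context information of the weather warning elements of the corresponding type according to the position information;
    a clustering and element matching template generation subunit configured to cluster the context information of weather warning elements of the same type according to a preset number of cluster centers, and to generate element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers.
  10. The apparatus according to claim 9, wherein the element matching template generation unit further comprises:
    a padding subunit configured to, in response to the actual length of the preceding or following context information being less than a preset length, pad the preceding or following context information of the corresponding weather warning element with preset characters until the preceding context, the following context, or both reach the preset length;
    an expression conversion subunit configured to convert the context information that has the preset length after padding into context vectors; and
    the clustering and element matching template generation subunit comprises a clustering module configured to cluster the context information of weather warning elements of the same type according to the preset number of cluster centers, the clustering module being further configured to:
    cluster the context vectors of weather warning elements of the same type according to the preset number of cluster centers.
  11. The apparatus according to claim 9, wherein the clustering and element matching template generation subunit comprises an element matching template generation module configured to generate element matching templates for the weather warning elements of the corresponding type according to the obtained cluster centers, the element matching template generation module being further configured to:
    generate a regular expression list for the weather warning elements of the corresponding type according to the obtained cluster centers, wherein the regular expression list records the regular expressions generated from the different cluster centers of the weather warning elements of the corresponding type.
  12. The apparatus according to any one of claims 7-11, further comprising:
    a target element occurrence counting unit configured to count, across the key weather warning texts issued within a preset time period, the actual number of occurrences of preset target weather warning elements or target weather warning element combinations;
    a statistics and prediction result generation unit configured to generate statistics and prediction results according to the actual number.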
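  13. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the weather warning text processing method according to any one of claims 1-6.
  14. A non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the weather warning text processing method according to any one of claims 1-6.
  15. A computer program product, comprising a computer program that, when executed by a processor, implements the weather warning text processing method according to any one of claims 1-6.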
PCT/CN2021/100525 2020-12-17 2021-06-17 气象预警文本处理方法、相关装置及计算机程序产品 WO2022127057A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21827554.3A EP4040329A4 (en) 2020-12-17 2021-06-17 METHOD OF PROCESSING A WEATHER WARNING TEXT, RELATED DEVICE AND COMPUTER PROGRAM PRODUCT
US17/646,665 US20220121812A1 (en) 2020-12-17 2021-12-30 Method for processing weather alert text, apparatus and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011492994.5 2020-12-17
CN202011492994.5A CN112560468A (zh) 2020-12-17 2020-12-17 气象预警文本处理方法、相关装置及计算机程序产品

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/646,665 Continuation US20220121812A1 (en) 2020-12-17 2021-12-30 Method for processing weather alert text, apparatus and storage medium

Publications (1)

Publication Number Publication Date
WO2022127057A1 true WO2022127057A1 (zh) 2022-06-23

Family

ID=75064532

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/100525 WO2022127057A1 (zh) 2020-12-17 2021-06-17 气象预警文本处理方法、相关装置及计算机程序产品

Country Status (2)

Country Link
CN (1) CN112560468A (zh)
WO (1) WO2022127057A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117729060A (zh) * 2024-02-07 2024-03-19 中国气象局公共气象服务中心(国家预警信息发布中心) 一种预警信息群发决策方法和装置

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560468A (zh) * 2020-12-17 2021-03-26 北京百度网讯科技有限公司 气象预警文本处理方法、相关装置及计算机程序产品
CN113936432B (zh) * 2021-12-17 2022-03-29 中国气象局公共气象服务中心(国家预警信息发布中心) 一种气象预警图文的生成方法、装置及电子设备
CN114882142B (zh) * 2022-05-13 2022-12-06 北京天译科技有限公司 气象数据的图形产品加工方法及系统
CN116578781B (zh) * 2023-04-28 2023-10-24 北京天译科技有限公司 应用神经网络算法的气象服务推送方法及服务器

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101079026A (zh) * 2007-07-02 2007-11-28 北京百问百答网络技术有限公司 文本相似度、词义相似度计算方法和系统及应用系统
CN106446154A (zh) * 2016-09-21 2017-02-22 广东奥博信息产业有限公司 一种气象信息云推送控制方法
CN108197163A (zh) * 2017-12-14 2018-06-22 上海银江智慧智能化技术有限公司 一种基于裁判文书的结构化处理方法
CN109275133A (zh) * 2018-11-22 2019-01-25 河北冀云气象技术服务有限责任公司 天气预报发送方法及装置
US10216837B1 (en) * 2014-12-29 2019-02-26 Google Llc Selecting pattern matching segments for electronic communication clustering
KR20190105894A (ko) * 2018-03-06 2019-09-18 코나아이 (주) 부채널 템플릿 분석에서의 템플릿 클러스터링 방법 및 저장 매체
CN112560468A (zh) * 2020-12-17 2021-03-26 北京百度网讯科技有限公司 气象预警文本处理方法、相关装置及计算机程序产品

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9424254B2 (en) * 2012-11-29 2016-08-23 Thomson Reuters Global Resoures Systems and methods for natural language generation

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
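CN101079026A (zh) * 2007-07-02 2007-11-28 北京百问百答网络技术有限公司 Method and system for computing text similarity and word-sense similarity, and application system
US10216837B1 (en) * 2014-12-29 2019-02-26 Google Llc Selecting pattern matching segments for electronic communication clustering
CN106446154A (zh) * 2016-09-21 2017-02-22 广东奥博信息产业有限公司 Cloud push control method for meteorological information
CN108197163A (zh) * 2017-12-14 2018-06-22 上海银江智慧智能化技术有限公司 Structured processing method based on court judgment documents
KR20190105894A (ko) * 2018-03-06 2019-09-18 코나아이 (주) Template clustering method in side-channel template analysis, and storage medium
CN109275133A (zh) * 2018-11-22 2019-01-25 河北冀云气象技术服务有限责任公司 Method and apparatus for sending weather forecasts
CN112560468A (zh) * 2020-12-17 2021-03-26 北京百度网讯科技有限公司 Weather warning text processing method, related apparatus, and computer program product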

Cited By (1)

Publication number Priority date Publication date Assignee Title
CN117729060A (zh) * 2024-02-07 2024-03-19 中国气象局公共气象服务中心(国家预警信息发布中心) Method and apparatus for deciding on mass distribution of warning information

Also Published As

Publication number Publication date
CN112560468A (zh) 2021-03-26

Similar Documents

Publication Publication Date Title
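WO2022127057A1 (zh) Weather warning text processing method, related apparatus, and computer program product
CN108509569A (zh) Method, apparatus, electronic device, and storage medium for generating an enterprise profile
JP2023529939A (ja) Method and apparatus for extracting multimodal POI features
CN103793372A (zh) Extracting semantic relationships from table structures in electronic documents
US20230005283A1 (en) Information extraction method and apparatus, electronic device and readable storage medium
WO2022142048A1 (zh) Wake-up index monitoring method and apparatus, and electronic device
CN113963197A (zh) Image recognition method and apparatus, electronic device, and readable storage medium
CN112507736A (zh) Real-time online social translation application system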
US20220121812A1 (en) Method for processing weather alert text, apparatus and storage medium
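CN113553415B (zh) Question-answer matching method and apparatus, and electronic device
CN115510247A (zh) Method, apparatus, device, and storage medium for constructing an electricity-carbon policy knowledge graph
CN110474905B (zh) Entity recognition method and apparatus, electronic device, and storage medium
CN115017256A (zh) Electric power data processing method and apparatus, electronic device, and storage medium
CN113806522A (zh) Summary generation method, apparatus, device, and storage medium
CN113360712B (zh) Method and apparatus for generating a video representation, and electronic device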
US20220374603A1 (en) Method of determining location information, electronic device, and storage medium
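CN114462364B (zh) Method and apparatus for entering information
CN117272970B (zh) Document generation method, apparatus, device, and storage medium
CN113066498B (zh) Information processing method, device, and medium
CN113407749B (zh) Image index construction method and apparatus, electronic device, and storage medium
CN113836151B (zh) Data processing method and apparatus, electronic device, and computer-readable medium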
US20220391602A1 (en) Method of federated learning, electronic device, and storage medium
EP4099319A1 (en) Wake-up index monitoring method and apparatus, and electronic device
EP4075425A2 (en) Speech processing method and apparatus, device, storage medium and program
CN114860965A (zh) Meeting information recording method and apparatus based on NLP technology, electronic device, and medium

Legal Events

Date Code Title Description
ENP Entry into the national phase Ref document number: 2021827554; Country of ref document: EP; Effective date: 20211230
NENP Non-entry into the national phase Ref country code: DE