CN113343652A - Text processing method, device, equipment and medium - Google Patents

Text processing method, device, equipment and medium

Info

Publication number
CN113343652A
Authority
CN
China
Prior art keywords
content
text
preset
address
sending
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110647402.0A
Other languages
Chinese (zh)
Inventor
狄玮杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lightning Express Software Beijing Co ltd
Original Assignee
Lightning Express Software Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lightning Express Software Beijing Co ltd
Priority to CN202110647402.0A
Publication of CN113343652A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/174 Form filling; Merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application discloses a text processing method, apparatus, device and storage medium. The method comprises: after receiving a sending instruction, acquiring a text to be processed that contains sending information; parsing the text to be processed using a preset parsing library to determine a parsing result, wherein the parsing result comprises at least one sending content parameter; and writing each of the at least one sending content parameter into the sending information input position corresponding to that parameter, to generate a sending page. With this technical solution the user does not need to enter the sending content parameters manually: the preset parsing library parses them out of the text, and they are then quickly written into the corresponding sending information input positions. This improves text entry efficiency, allows the sending page to be generated quickly, and greatly improves the user experience.

Description

Text processing method, device, equipment and medium
Technical Field
The present invention relates generally to the field of information processing technologies, and in particular, to a text processing method, apparatus, device, and medium.
Background
In recent years, with the rapid development of e-commerce, express delivery services have grown at high speed. To send or receive a parcel, a user must fill in the complete text information of the recipient/sender on a social platform, a mobile app or a local computer. This text information may include a name, an address, an administrative region, a postal code and a telephone number, together with some irrelevant information.
In the related art, a user can fill in express delivery information on the sending interface in two ways: by copying and pasting pieces of the information into the relevant positions one at a time, or by manually typing the text into each position. Both ways make filling in sending information inefficient and lead to a poor user experience. How to fill in sending information quickly, so as to improve the user's sending efficiency, is therefore a problem to be solved.
Disclosure of Invention
In view of the above-mentioned drawbacks and deficiencies of the prior art, it is desirable to provide a text processing method, apparatus, device, and medium.
In a first aspect, the present invention provides a text processing method, including:
after receiving a sending instruction, acquiring a text to be processed that contains sending information;
parsing the text to be processed using a preset parsing library to determine a parsing result, wherein the parsing result comprises at least one sending content parameter;
and writing each of the at least one sending content parameter into the sending information input position corresponding to that parameter, to generate a sending page.
In one embodiment, the parsing result comprises telephone content, address content and name content, and parsing the text to be processed using the preset parsing library to determine the parsing result comprises:
extracting at least one digit string, a name keyword and an address keyword from the text to be processed according to a preset text recognition rule, wherein the name keyword is a reserved word containing a preset name, and the address keyword is a reserved word containing a preset province, city or township;
determining the telephone content and the address content based on the length of each digit string and the address keyword;
determining the text containing the name keyword as the name content.
In one embodiment, determining the telephone content and the address content based on the length of each digit string comprises:
judging whether the length of the digit string falls within a first preset length interval;
if the length falls within the first preset length interval, determining the digit string as the telephone content;
if the length does not fall within the first preset length interval, judging whether the length of the digit string matches a second preset length threshold, wherein the minimum value of the first preset length interval is greater than the second preset length threshold;
and if the length matches the second preset length threshold, determining the text corresponding to that digit string as the address content.
In one embodiment, after determining the text corresponding to the digit string that matches the second preset length threshold as the address content, the method further comprises:
preprocessing the address content to obtain preprocessed address content;
segmenting the preprocessed address content at spaces, in a preset segmentation order, to obtain a plurality of character strings;
parsing each character string using a preset word comparison table to determine province, city and township codes;
and removing the character strings corresponding to the province, city and township codes to determine an address resolution result.
In one embodiment, preprocessing the address content to obtain the preprocessed address content comprises:
converting repeated characters contained in the address content into a unified format according to a preset rule;
and finding and deleting redundant characters at the end of the address content.
In one embodiment, the address resolution result comprises a detailed address, and parsing each character string using the preset word comparison table to determine the province, city and township codes comprises:
judging whether each character string contains a prefix word;
if a prefix word is contained, judging whether the prefix word is correct;
if the prefix word is correct, determining the province, city and township codes based on the prefix word and the preset word comparison table;
and if the province, city or township name that follows the prefix word is misspelled, determining the edit distance between two adjacent character strings using a dynamic programming algorithm, and determining the province, city and township codes based on the edit distance.
In one embodiment, after judging whether a prefix word is contained, the method further comprises:
if the character string does not contain a prefix word, determining the province, city and township codes using the preset word comparison table.
In a second aspect, an embodiment of the present application provides a text processing apparatus, including:
an acquisition module, configured to acquire a text to be processed that contains sending information after a sending instruction is received;
a parsing module, configured to parse the text to be processed using a preset parsing library and determine a parsing result, wherein the parsing result comprises at least one sending content parameter;
and a processing module, configured to write each of the at least one sending content parameter into the sending information input position corresponding to that parameter, to generate a sending page.
In a third aspect, an embodiment of the present application provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the text processing method when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, the computer program being used for implementing the text processing method according to the first aspect.
In summary, in the text processing method, apparatus, device and medium provided by the embodiments of the present application, after a sending instruction is received, a text to be processed that contains sending information is acquired; the text is parsed using a preset parsing library to determine a parsing result comprising at least one sending content parameter; and each sending content parameter is then written into its corresponding sending information input position. With this technical solution the user does not need to enter the sending content parameters manually: the preset parsing library parses them out, and they are quickly written into the corresponding input positions so that the sending page is generated automatically. This improves text entry efficiency and greatly improves the user experience.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
Fig. 1 is a schematic flowchart of a text processing method according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a text processing method according to another embodiment of the present application;
Fig. 3 is a schematic flowchart of a text processing method according to another embodiment of the present application;
Fig. 4 is a schematic interface diagram of a text processing method according to another embodiment of the present application;
Fig. 5 is a schematic interface diagram of a text processing method according to another embodiment of the present application;
Fig. 6 is a schematic interface diagram of a text processing method according to another embodiment of the present application;
Fig. 7 is a schematic structural diagram of a text processing apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It can be understood that, with the rapid development of e-commerce, a user needs to fill in the complete text information of a recipient/sender on a social platform (such as WeChat), a mobile app or a local computer to complete operations such as receiving or sending a parcel. In the related art, a user can fill in express delivery information on the sending interface in two ways: by copying and pasting pieces of the information into the relevant positions one at a time, or by manually typing the text into each position. Both ways are inefficient. How to fill in sending information quickly, so as to improve the user's sending efficiency, is therefore a problem to be solved.
To address these defects, an embodiment of the present invention provides a text processing method in which a text to be processed that contains sending information is acquired, the text is parsed using a preset parsing library to determine a parsing result comprising at least one sending content parameter, and each sending content parameter is then written into its corresponding sending information input position to generate a sending page. With this technical solution the user does not need to enter the sending content parameters manually: the preset parsing library parses them out, and they are written into the corresponding input positions. This improves text entry efficiency and greatly improves the user experience.
The text processing method provided by the embodiment of the application can be applied to terminal equipment, and the terminal equipment can include mobile terminals such as mobile phones, tablet computers, notebook computers, palm computers, Personal Digital Assistants (PDAs), Portable Media Players (PMPs), navigation devices, wearable devices, smart bands, pedometers and the like, and fixed terminals such as Digital TVs, desktop computers and the like.
For convenience of understanding and explanation, the text processing method, apparatus, device and medium provided by the embodiments of the present application are described in detail below with reference to fig. 1 to 6.
It should be noted that the execution main body of the following method embodiments may be a text processing apparatus, and the apparatus may be implemented as part or all of a terminal device by software, hardware, or a combination of software and hardware.
For convenience of understanding and explanation, the text processing method, apparatus, device and storage medium provided by the embodiments of the present application are described in detail below with reference to fig. 1 to 8.
Fig. 1 is a schematic flowchart of a text processing method according to an embodiment of the present invention, and as shown in fig. 1, the method includes:
Step S101, after receiving a sending instruction, acquiring a text to be processed that contains sending information.
Specifically, the text to be processed may be chapter-level or sentence-level text containing the sending information, and may also include text in table format. In this embodiment, the language of the text to be processed may be Thai.
Optionally, the text to be processed may be text of any type acquired by the computer device. It may be acquired from a location specified by the user, imported through another external device, or submitted to the computer device by the user; this embodiment places no limit on the source. There may be one or more texts to be processed, and each may include at least one word.
It should be noted that when a user needs to send a parcel, the user can trigger the sending button on the relevant interface of an app or official account, so that the server receives the user's sending instruction. The user then pastes the text to be processed into the corresponding area of the user interface, so that the server obtains the text to be processed containing the sending information.
Step S102, parsing the text to be processed using a preset parsing library to determine a parsing result, wherein the parsing result comprises at least one sending content parameter.
In this step, the parsing library is written in advance by the user in a programming language such as C++. The parsing library parses the text to be processed to obtain a parsing result, which may comprise several content parameters. When the text to be processed is sending information, the sending content parameters obtained by parsing may include telephone content, address content and name content.
It should be noted that the parsing library may be compatible with systems such as PC, iOS and Android.
Optionally, as an implementation manner, on the basis of the foregoing embodiment, as shown in fig. 2, the step S102 may include the following steps:
S201, extracting at least one digit string, a name keyword and an address keyword from the text to be processed according to a preset text recognition rule.
S202, determining the telephone content and the address content based on the length of each numeric string.
S203, determining the text containing the name keywords as the name content.
Specifically, taking sending information as the text to be processed: after the text is acquired, the preset parsing library may be used to parse it so as to divide the whole text into an address part, a telephone part and a name part.
Digit strings, name keywords and address keywords are extracted from the text to be processed according to a preset text recognition rule. A digit string may be of any length, and there may be one or two name keywords. A name keyword is a reserved word containing a preset name, and an address keyword is a reserved word containing a preset province, city or township; both can be stored in the parsing library predefined by the user. Various rules can be set in the parsing library. For example, a first preset length interval and a second preset length threshold may be set for digit strings, where the minimum value of the first preset length interval is greater than the second preset length threshold; for instance, the first preset length interval is 10-11 and the second preset length threshold is 5.
After a digit string is extracted, it is judged whether its length falls within the first preset length interval. If so, the digit string is determined to be telephone content. If not, it is judged whether the length matches the second preset length threshold. If it does, the digit string can be determined to be postal code content, and the text containing the postal code content is determined to be the address content.
Illustratively, after a digit string and a name keyword are extracted from the sending information, it may be judged whether the digit string has 10 digits; a 10-digit string is determined to be the telephone part. If it does not have 10 digits, it is judged whether it has 5 digits; a 5-digit string is determined to be the postal code part. The text containing the postal code part is then determined to be the address content, and the text containing the name keyword is determined to be the name content.
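The length test described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the thresholds are the example values from the text (10-11 digits for a telephone number, 5 digits for a postal code), and the function name `classify_digit_strings` is hypothetical, not part of the parsing library in the application.

```python
import re

# Assumed thresholds, taken from the example values in the text.
PHONE_LENGTHS = range(10, 12)  # first preset length interval: 10 or 11 digits
ZIP_LENGTH = 5                 # second preset length threshold

def classify_digit_strings(text):
    """Return (phone, zip_code) found in text; either may be None."""
    phone = zip_code = None
    for digits in re.findall(r"\d+", text):
        if len(digits) in PHONE_LENGTHS:
            # Keep the first digit string that looks like a telephone number.
            phone = phone or digits
        elif len(digits) == ZIP_LENGTH:
            # Keep the first digit string that looks like a postal code.
            zip_code = zip_code or digits
    return phone, zip_code
```

Shorter digit runs such as house numbers match neither test and are left in the text, so they remain available for the address part.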
Further, after the determination of the address portion, the following method steps may be included:
S301, the address content is preprocessed to obtain preprocessed address content.
In this step, the user may construct in advance a word comparison table of province, city and township codes. The codes in the table may follow certain rules, for example: TH (Thailand), TH01 (01 is the province code), TH0101 (the second 01 is the city code), TH010101 (the third 01 is the township code).
[Figures BDA0003109634790000071 to BDA0003109634790000073: images in the original publication, not reproduced here]
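As a rough illustration of the coding rule (TH, TH01, TH0101, TH010101), the following Python sketch splits a code into its cumulative province, city and township codes. It assumes every code is "TH" followed by two digits per level, which is how the examples in the text read; the function name `split_region_code` is hypothetical.

```python
def split_region_code(code):
    """Split a code such as 'TH010101' into cumulative level codes."""
    # Assumption: 'TH' plus two digits for each of up to three levels.
    if not code.startswith("TH") or len(code) not in (4, 6, 8):
        raise ValueError(f"unexpected region code: {code!r}")
    levels = ("province", "city", "township")
    depth = (len(code) - 2) // 2
    return {levels[i]: code[: 4 + 2 * i] for i in range(depth)}
```

Because each deeper code starts with its parent's code, a township code alone is enough to recover the city and province it belongs to.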
It should be noted that many townships and cities in Thailand share the same name. For example, one name [Thai text shown as an image in the original, not reproduced] may be the name of three townships and may also be the name of a city. A code, by contrast, corresponds to only one name. Each code and the province, city or township name corresponding to it can be stored in advance, so that the address part can still be parsed against the word comparison table of province, city and township codes when the network is disconnected.
A mapping table between postal codes and provinces can also be constructed in advance. Because Thai postal codes are fixed, a postal code generally determines the province, although a few postal codes span more than one province. Each province and its corresponding postal codes can be stored in the app in advance, so that the postal code to province mapping table remains available when the network is disconnected.
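A minimal Python sketch of such a mapping table is shown below. The postal codes and province codes are made-up sample values, not real Thai data, and the names `ZIP_TO_PROVINCES` and `provinces_for_zip` are hypothetical. Each postal code maps to a list so that the rare postal code spanning several provinces can be represented.

```python
# Hypothetical sample data for illustration only (not real postal codes).
ZIP_TO_PROVINCES = {
    "11111": ["TH01"],
    "22222": ["TH02"],
    "33333": ["TH02", "TH03"],  # a postal code that spans two provinces
}

def provinces_for_zip(zip_code):
    """Return the candidate province codes for a postal code (may be empty)."""
    return ZIP_TO_PROVINCES.get(zip_code, [])

def province_if_unambiguous(zip_code):
    """Return the province code only when the postal code maps to exactly one."""
    candidates = provinces_for_zip(zip_code)
    return candidates[0] if len(candidates) == 1 else None
```

Since the table is an in-memory dictionary shipped with the app, the lookup needs no network request.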
After the address part is determined, it may be normalized by preprocessing. For example, repeated characters contained in the address content may be converted into a unified format according to a preset rule (multiple spaces become a single space), and redundant characters at the end of the address content may be found and deleted (trailing punctuation such as commas and periods is removed).
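The two normalization rules mentioned here can be sketched in Python as follows. The set of trailing characters to strip is an assumption chosen for illustration, and `preprocess_address` is a hypothetical name.

```python
import re

def preprocess_address(address):
    """Collapse runs of whitespace and drop trailing punctuation."""
    # Rule 1: repeated whitespace becomes a single space.
    collapsed = re.sub(r"\s+", " ", address).strip()
    # Rule 2: remove redundant trailing characters (assumed set: space , . ;).
    return collapsed.rstrip(" ,.;")
```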
S302, the preprocessed address content is segmented at spaces, in a preset segmentation order, to obtain a plurality of character strings.
Specifically, the preset segmentation order may be from right to left. Because words in Thai addresses are usually separated by spaces, the text can be segmented at spaces. Thai addresses are conventionally written in the order detailed address, township, city, province. The text of the address part is therefore split at spaces into an array of words, and this array is ordered: the preprocessed address content is segmented at spaces from right to left to obtain a plurality of character strings.
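As a small illustration, right-to-left segmentation at spaces can be written in Python as below. The function name is hypothetical, and the sample address in the usage note uses placeholder words rather than real Thai place names.

```python
def segment_right_to_left(address):
    """Split an address at whitespace and return the words from right to left."""
    return address.split()[::-1]
```

For an address written as "99/1 Moo2 TownshipX CityY ProvinceZ", the first string returned is the province, followed by the city and the township, which matches the order in which the administrative levels are resolved.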
S303, parsing each character string using the preset word comparison table to determine the province, city and township codes.
In this step, each word may correspond to a group of province, city and township codes. The character strings may be parsed in order from right to left and looked up in the word comparison table to determine the province, city and township codes. It should be noted that the word comparison table also includes joined words of the form "township name + city name" (with no space in between), to handle addresses in which the user typed no space between the township name and the city name.
Thai has prefix words for provinces, cities and townships, for example:
province: [Thai prefix words shown as an image in the original, not reproduced]
city: [Thai prefix words shown as an image in the original, not reproduced]
township: [Thai prefix words shown as an image in the original, not reproduced]
province and city terms: [Thai examples shown as an image in the original, not reproduced]
These special prefix words make it clear whether the word in a character string is the name of a province, a city or a township. In the original examples, the underlined part is the prefix word.
Further, for each character string, it can be judged whether a prefix word is contained. If a prefix word is contained, it is judged whether the prefix word is correct. If the prefix word is correct, the province, city and township codes are determined based on the prefix word and the preset word comparison table. If the province, city or township name following the prefix word is misspelled, the edit distance between two adjacent character strings is determined using a dynamic programming algorithm, and the province, city and township codes are determined based on the edit distance.
If the character string does not contain a prefix word, the province, city and township codes can be determined directly from the preset word comparison table. If the word corresponds to exactly one code, the code can be obtained directly. If the word corresponds to several codes, the higher administrative levels (province and city) must be obtained first.
Some preset rules can also be applied. For example, suppose the word comparison table contains the entries [shown as an image in the original, not reproduced]. If the name [shown as an image in the original, not reproduced] appears in the address and no province has been obtained yet, the name is taken to be a province, with code TH02. If one name corresponds to several townships and the province and city codes have already been resolved, the township code can easily be obtained.
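The disambiguation idea (one name, several candidate codes, narrowed by a level that is already known) can be sketched in Python as follows. The table contents are invented placeholders, and `NAME_TO_CODES` and `resolve_code` are hypothetical names; the sketch relies only on the prefix structure of the codes described in the text.

```python
# Hypothetical word comparison table: one name may map to several codes.
NAME_TO_CODES = {
    "alpha": ["TH010203", "TH020103", "TH030201"],  # three townships share a name
    "beta": ["TH02"],                               # unique: a province
}

def resolve_code(name, known_prefix=None):
    """Return the single code for name, or None if it is missing or ambiguous."""
    codes = NAME_TO_CODES.get(name, [])
    if known_prefix is not None:
        # Keep only candidates under the province/city already resolved.
        codes = [c for c in codes if c.startswith(known_prefix)]
    return codes[0] if len(codes) == 1 else None
```

With no context, "alpha" is ambiguous; once the province TH02 has been resolved, only one candidate remains.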
If the character string contains a prefix word but the user misspelled the province, city or township name, a dynamic programming algorithm can be used to calculate the edit distance between two character strings and correct the misspelled name. If the user misspells the province, it can be corrected using the postal code. If the user misspells the city name, the province is determined first (from the postal code or from a previously parsed name). Suppose there are 5 cities under that province: the edit distance between each of the 5 city names and the misspelled name is calculated using the dynamic programming algorithm, and the city name with the smallest edit distance is taken as the real city name to correct the user's input. If the user misspells the township name, it can be corrected in the same way. After step 4 is executed, a township code such as TH010101 should have been obtained. With a township code available, the app can select the township directly from the code, and the user does not need to choose the administrative region manually.
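A minimal Python sketch of this correction step is given below. It uses the standard Levenshtein edit distance computed by dynamic programming, which is one common choice; the application does not state which edit-distance variant it uses, so that is an assumption. The function names and the romanized city names in the usage note are for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def correct_name(written, candidates):
    """Pick the candidate name closest to what the user wrote."""
    return min(candidates, key=lambda name: edit_distance(written, name))
```

For example, `correct_name("Bangkk", ["Bangkok", "Phuket", "Pattaya"])` picks "Bangkok", since it is one insertion away from the misspelled input. Restricting the candidates to the cities under an already known province keeps the comparison small.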
S304, removing the character strings corresponding to the province, city and township codes, and determining an address resolution result.
After the province, city and township codes are obtained by parsing, the character strings corresponding to the province, city and township can be removed to determine the address resolution result, which may be a detailed address. For example, for an address made up of Beijing, Chaoyang District and a street address ending in "No. 20", the "Beijing" and "Chaoyang District" parts are removed, and the remaining street address is the detailed address.
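As an illustration, removing the already resolved administrative names from the segmented address could look like this in Python. The function name and the sample tokens are hypothetical, and the sketch assumes the address has already been split into words in their original left-to-right order.

```python
def detailed_address(tokens, region_names):
    """Join the tokens that are not province, city or township names."""
    return " ".join(t for t in tokens if t not in region_names)
```

For the tokens `["99/1", "Moo2", "TownshipX", "CityY", "ProvinceZ"]` and the resolved names `{"TownshipX", "CityY", "ProvinceZ"}`, what remains is the detailed address "99/1 Moo2".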
Step S103, writing each content parameter in at least one sending content parameter into a sending information input position corresponding to each content parameter, and generating a sending page.
After the text to be processed has been parsed using the preset parsing library and at least one sending content parameter has been obtained, the position corresponding to each sending content parameter is identified, and each parameter is written into its corresponding sending information input position to generate the sending page. For example, the telephone content is written into the telephone position, the name content into the name position, and the detailed address into the address position.
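The writing step can be pictured as copying each parsed parameter into the form field of the same name. The Python sketch below models the sending page as a plain dictionary; the field names "phone", "name" and "address" and the function name `fill_sending_form` are assumptions for illustration, not the interface of the application.

```python
def fill_sending_form(form, parsed):
    """Return a copy of form with each parsed parameter written to its field."""
    filled = dict(form)
    for field, value in parsed.items():
        # Only write to input positions that exist, and skip empty values.
        if field in filled and value:
            filled[field] = value
    return filled
```

Fields for which nothing was parsed keep their previous content, so the user can still complete them by hand.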
For example, as shown in Fig. 4, when a user is sending a parcel, the text to be processed is sending information written in Thai. The user can paste the Thai sending information into a specific text box: the system can recognize the clipboard content and ask the user whether to paste it, and the user can click the "cancel" or "paste" button. After "paste" is clicked, the Thai sending information is pasted into the "intelligent resolution address box" text box. The user then clicks the "resolution" button, and the sending information is parsed using the preset parsing library to obtain a parsing result that includes telephone content, address content and name content, each of which is written into its corresponding text box. Referring to Fig. 5, when small labels are printed in batches, the sending information may likewise be parsed and the parsing result filled into the corresponding fields. Referring to Fig. 6, after the address part is obtained by parsing, it may be placed in an "address resolution box" and processed further to determine an address resolution result, which may be a detailed address. The user can modify the content of the address resolution box in real time according to the parsing result, so that the address resolution result can be saved and used.
In the text processing method provided in this embodiment, after a sending instruction is received, a text to be processed containing sending information is acquired; the text is analyzed with a preset analysis library to determine an analysis result that includes at least one sending content parameter; and each sending content parameter is then written into its corresponding sending information input position. With this technical solution, the user does not need to enter the sending content parameters manually: the preset analysis library alone is enough to obtain them, after which they are quickly written into the corresponding sending information input positions and the sending page is generated automatically. This improves text entry efficiency and, in turn, the user experience.
Furthermore, the analysis library provided by the embodiment of the application can perform the analysis without a network connection, so the user does not need to send a request to a server. This avoids the problem that the sending information cannot be filled in automatically when network conditions are poor, and improves sending efficiency.
It should be noted that while the operations of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the steps depicted in the flowcharts may change the order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
In another aspect, fig. 7 is a schematic structural diagram of a text processing apparatus according to an embodiment of the present invention. As shown in fig. 7, the apparatus may implement the method shown in figs. 1-3, and may include:
an acquiring module 10, configured to acquire a text to be processed containing sending information after a sending instruction is received;
an analysis module 20, configured to analyze the text to be processed using a preset analysis library and determine an analysis result, where the analysis result includes at least one sending content parameter;
a processing module 30, configured to write each of the at least one sending content parameter into the sending information input position corresponding to that parameter, so as to generate a sending page.
Optionally, the analysis module 20 includes:
an extracting unit 201, configured to extract at least one numeric string, a name keyword and an address keyword from the text to be processed according to a preset text recognition rule, where the name keyword is a reserved word containing a preset name, and the address keyword is a reserved word containing a preset province, city and county;
a first determining unit 202, configured to determine telephone content and address content based on the length of each numeric string and the address keyword;
a second determining unit 203, configured to determine the text containing the name keyword as the name content.
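The extraction performed by the extracting unit can be illustrated with a short sketch. The source does not disclose the preset text recognition rule or the reserved words, so the keyword tuples and the function `extract_candidates` below are assumptions made purely for illustration, and English keywords stand in for the reserved words.

```python
import re
from typing import Dict, List

# Hypothetical reserved words; the actual preset lists are not given in the source.
NAME_KEYWORDS = ("name", "recipient", "sender")
ADDRESS_KEYWORDS = ("province", "city", "county")


def extract_candidates(text: str) -> Dict[str, List[str]]:
    """Extract numeric strings and any reserved keywords found in the text."""
    digit_strings = re.findall(r"\d+", text)  # every maximal run of digits
    lowered = text.lower()                    # case-insensitive keyword match
    return {
        "digit_strings": digit_strings,
        "name_keywords": [k for k in NAME_KEYWORDS if k in lowered],
        "address_keywords": [k for k in ADDRESS_KEYWORDS if k in lowered],
    }
```

The numeric strings collected here are the input to the length-based determination of telephone and address content.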
Optionally, the analysis module 20 is configured to:
determine whether the length of the numeric string falls within a first preset length interval;
if it falls within the first preset length interval, determine the numeric string as the telephone content;
if it does not fall within the first preset length interval, determine whether the length of the numeric string meets a second preset length threshold, where the lower bound of the first preset length interval is greater than the second preset length threshold;
and if it meets the second preset length threshold, determine the text corresponding to that numeric string as the address content.
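The length test above can be sketched as a two-stage check. The source gives no concrete values, so the interval `(7, 12)` and the threshold `6` are hypothetical, as is the reading that a numeric string "meets" the second threshold when its length does not exceed it.

```python
from typing import Optional

PHONE_LENGTH_INTERVAL = (7, 12)  # hypothetical first preset length interval
ADDRESS_LENGTH_THRESHOLD = 6     # hypothetical second preset length threshold

# The text requires the lower bound of the interval to exceed the threshold.
assert PHONE_LENGTH_INTERVAL[0] > ADDRESS_LENGTH_THRESHOLD


def classify_digit_string(digits: str) -> Optional[str]:
    """Classify a numeric string as 'phone', 'address', or None by its length."""
    low, high = PHONE_LENGTH_INTERVAL
    length = len(digits)
    if low <= length <= high:
        return "phone"
    # Assumption: "meets the threshold" means the length is at most the threshold.
    if length <= ADDRESS_LENGTH_THRESHOLD:
        return "address"
    return None
```

Under these assumed values, an 11-digit string is treated as telephone content, while a short string such as a house number or postal code marks its surrounding text as address content.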
Optionally, the apparatus is further configured to:
preprocessing the address content to obtain preprocessed address content;
segmenting the preprocessed address content on spaces, in a preset segmentation order, to obtain a plurality of character strings;
analyzing each character string using a preset word comparison table to determine province, city and county codes;
and removing the character strings corresponding to the province, city and county codes to determine an address resolution result.
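The segmentation and table lookup steps can be sketched as below. The table contents and codes (`P01`, `C01`, `D01`) are invented placeholders rather than real administrative codes, the segmentation order is assumed to be left to right, and `resolve_address` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# Hypothetical word comparison table: region name -> placeholder code.
REGION_TABLE: Dict[str, str] = {
    "Guangdong": "P01",  # province
    "Shenzhen": "C01",   # city
    "Nanshan": "D01",    # county or district
}


def resolve_address(preprocessed: str) -> Tuple[List[str], str]:
    """Split on spaces, look up region codes, and return (codes, detailed address)."""
    codes: List[str] = []
    remainder: List[str] = []
    for token in preprocessed.split():  # left-to-right segmentation on spaces
        code = REGION_TABLE.get(token)
        if code is not None:
            codes.append(code)          # token is a known region name
        else:
            remainder.append(token)     # token belongs to the detailed address
    return codes, " ".join(remainder)
```

What remains after the region tokens are removed corresponds to the detailed address that the text calls the address resolution result.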
Optionally, the apparatus is further configured to:
converting repeated characters contained in the address content into characters in a unified format according to a preset rule;
and finding and deleting redundant trailing characters in the address content.
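One possible reading of these two preprocessing steps is sketched below. The source does not state the preset rule, so treating runs of separators as the "repeated characters" and treating trailing punctuation as the "redundant trailing characters" are both assumptions, and `preprocess_address` is a hypothetical name.

```python
import re


def preprocess_address(address: str) -> str:
    """Unify separator runs to single spaces and drop redundant trailing characters."""
    # Assumed rule: any run of whitespace, commas or semicolons (ASCII or
    # full-width) is replaced by exactly one space.
    unified = re.sub(r"[\s,;，；]+", " ", address)
    # Assumed rule: spaces, periods and hyphens at the tail are redundant.
    return unified.strip().rstrip(" .。-")
```

Normalizing separators to single spaces first is what makes the later space-based segmentation reliable.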
Optionally, the apparatus is further configured to:
determining whether each character string contains a prefix word;
if it contains a prefix word, determining whether the prefix word is correct;
if the prefix word is correct, determining the province, city and county codes based on the prefix word and the preset word comparison table;
and if the province, city or county name following the prefix word is wrong, determining the edit distance between two adjacent character strings using a dynamic programming algorithm, and determining the province, city and county codes based on the edit distance.
Optionally, the apparatus is further configured to:
and if the character string does not contain a prefix word, determining the province, city and county codes using the preset word comparison table.
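The edit-distance fallback mentioned above can be illustrated with the standard Levenshtein distance computed by dynamic programming. How the source compares "two adjacent character strings" and how it maps a distance to a code is not specified, so the nearest-entry lookup `closest_region_code`, its `max_distance` cutoff, and the table contents are assumptions for illustration only.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                  # deleting i characters of a to reach ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_region_code(token: str, table: Dict[str, str],
                        max_distance: int = 2) -> Optional[str]:
    """Return the code of the table entry nearest to token, if it is close enough."""
    if not table:
        return None
    best = min(table, key=lambda name: edit_distance(token, name))
    return table[best] if edit_distance(token, best) <= max_distance else None
```

With such a fallback, a misspelled region name can still be matched to a table entry, while a token far from every entry yields no code.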
The text processing apparatus provided in this embodiment may execute the embodiments of the method described above, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention, showing a computer system 700 suitable for implementing the terminal device or the server of the embodiment of the present application.
As shown in fig. 8, the computer system 700 includes a central processing unit (CPU) 701, which can perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 702 or a program loaded from a storage section 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the system 700. The CPU 701, the ROM 702 and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card or a modem. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 710 as needed, so that a computer program read from it can be installed into the storage section 708 as needed.
In particular, the process described above with reference to fig. 2 may be implemented as a computer software program, according to an embodiment of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method of fig. 2. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present application may be implemented in software or in hardware. The described units or modules may also be provided in a processor, which may be described as: a processor including an acquiring module, an analysis module and a processing module. In some cases the names of these units or modules do not limit the units or modules themselves; for example, the acquiring module may also be described as "a module for acquiring the text to be processed containing the sending information after a sending instruction is received".
As another aspect, the present application also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the foregoing device in the foregoing embodiment; or it may be a separate computer readable storage medium not incorporated into the device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the text processing methods described herein.
To sum up, according to the text processing method, apparatus, device and medium provided by the embodiment of the application, after a sending instruction is received, a text to be processed containing sending information is acquired; the text is analyzed with a preset analysis library to determine an analysis result that includes at least one sending content parameter; and each sending content parameter is then written into its corresponding sending information input position. With this technical solution, the user does not need to enter the sending content parameters manually: the preset analysis library alone is enough to obtain them, after which they are quickly written into the corresponding sending information input positions and the sending page is generated automatically. This improves text entry efficiency and, in turn, the user experience.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. A method of text processing, the method comprising:
after receiving a sending instruction, acquiring a text to be processed containing sending information;
analyzing the text to be processed by adopting a preset analysis library, and determining an analysis result, wherein the analysis result comprises at least one sending content parameter;
and writing each sending content parameter in the at least one sending content parameter into a sending information input position corresponding to each sending content parameter to generate a sending page.
2. The method of claim 1, wherein the analysis result includes telephone content, address content and name content, and wherein analyzing the text to be processed by adopting a preset analysis library to determine the analysis result comprises:
extracting at least one numeric string, a name keyword and an address keyword from the text to be processed according to a preset text recognition rule, wherein the name keyword is a reserved word containing a preset name, and the address keyword is a reserved word containing a preset province, city and county;
determining the telephone content and the address content based on the length of each numeric string and the address keyword;
determining text containing the name keyword as the name content.
3. The method of claim 2, wherein determining the telephone content and the address content based on the length of each numeric string comprises:
determining whether the length of the numeric string falls within a first preset length interval;
if the length falls within the first preset length interval, determining the numeric string as the telephone content;
if the length of the numeric string does not fall within the first preset length interval, determining whether the length of the numeric string meets a second preset length threshold, wherein the lower bound of the first preset length interval is greater than the second preset length threshold;
and if the length meets the second preset length threshold, determining the text corresponding to the numeric string that meets the second preset length threshold as the address content.
4. The method according to claim 3, wherein after determining the text corresponding to the numeric string that meets the second preset length threshold as the address content, the method further comprises:
preprocessing the address content to obtain preprocessed address content;
segmenting the preprocessed address content on spaces, in a preset segmentation order, to obtain a plurality of character strings;
analyzing each character string by adopting a preset word comparison table to determine province, city and county codes;
and removing the character strings corresponding to the province, city and county codes to determine an address resolution result.
5. The method of claim 4, wherein preprocessing the address content to obtain a preprocessed address content comprises:
converting the repeated characters contained in the address content into characters in a unified format according to a preset rule;
and finding and deleting redundant trailing characters in the address content.
6. The method of claim 4, wherein the address resolution result comprises a detailed address, and wherein analyzing each character string by adopting a preset word comparison table to determine the province, city and county codes comprises:
determining whether each character string contains a prefix word;
if the character string contains a prefix word, determining whether the prefix word is correct;
if the prefix word is correct, determining the province, city and county codes based on the prefix word and the preset word comparison table;
and if the province, city or county name following the prefix word is wrong, determining the edit distance between two adjacent character strings by adopting a dynamic programming algorithm, and determining the province, city and county codes based on the edit distance.
7. The method of claim 6, wherein after determining whether each character string contains a prefix word, the method further comprises:
and if the character string does not contain a prefix word, determining the province, city and county codes by adopting the preset word comparison table.
8. A text processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a text to be processed containing sending information after receiving a sending instruction;
the analysis module is used for analyzing the text to be processed by adopting a preset analysis library and determining an analysis result, wherein the analysis result comprises at least one sending content parameter;
and the processing module is used for writing each sending content parameter in the at least one sending content parameter into a sending information input position corresponding to each sending content parameter to generate a sending page.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the method of any of claims 1-7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1-7.
CN202110647402.0A 2021-06-10 2021-06-10 Text processing method, device, equipment and medium Pending CN113343652A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110647402.0A CN113343652A (en) 2021-06-10 2021-06-10 Text processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110647402.0A CN113343652A (en) 2021-06-10 2021-06-10 Text processing method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN113343652A true CN113343652A (en) 2021-09-03

Family

ID=77475691

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110647402.0A Pending CN113343652A (en) 2021-06-10 2021-06-10 Text processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113343652A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240573A (en) * 2021-12-23 2022-03-25 广州华多网络科技有限公司 Order receiving information matching method and device, equipment, medium and product thereof


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination