CN110147429A - Text comparison method, device, computer equipment and storage medium - Google Patents


Info

Publication number
CN110147429A
Authority
CN
China
Prior art keywords
text
match point
axis
traversal
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910297625.1A
Other languages
Chinese (zh)
Other versions
CN110147429B (en)
Inventor
余宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910297625.1A
Publication of CN110147429A
Application granted
Publication of CN110147429B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application relates to the field of big data and discloses a text comparison method, device, computer equipment and storage medium. The method includes: obtaining a first text and a second text, converting the first text and the second text each into single-line text, and mapping the converted first text and second text onto the X-axis and the Y-axis respectively; performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text; and compiling statistics on that match point information to obtain a text comparison result. By mapping the texts to be compared onto a two-dimensional plane and locating the identical characters between the texts according to the shortest distance between identical characters, the application improves the efficiency of text comparison and reduces its complexity.

Description

Text comparison method, device, computer equipment and storage medium
Technical field
This application relates to the field of big data, and in particular to a text comparison method, device, computer equipment and storage medium.
Background
In everyday use, text comparison is a fairly common problem with a wide range of applications, such as comparing papers. The core of text comparison is finding the differences between two given texts (which may be byte streams, etc.). At present, the mainstream approaches to finding differences between texts fall into two broad classes. One is based on edit distance (Edit Distance), such as the LD (Levenshtein distance) algorithm. The other is based on the longest common subsequence (Longest Common Subsequence), such as the Needleman/Wunsch algorithm. However, these algorithms are all relatively complex, resource-intensive and inefficient.
Summary of the invention
The purpose of this application is to address the deficiencies of the prior art by providing a text comparison method, device, computer equipment and storage medium that map the texts to be compared onto a two-dimensional plane and locate the identical characters between the texts according to the shortest distance between identical characters, thereby improving the efficiency of text comparison and reducing its complexity.
To achieve the above purpose, the technical solution of this application provides a text comparison method, device, computer equipment and storage medium.
This application discloses a text comparison method comprising the following steps:
obtaining a first text and a second text, converting the first text and the second text each into single-line text, and mapping the converted first text and second text onto the X-axis and the Y-axis respectively;
performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text;
compiling statistics on the match point information for identical characters in the first text and the second text to obtain a text comparison result.
Preferably, mapping the converted first text and second text onto the X-axis and the Y-axis respectively includes:
mapping the converted first text onto the X-axis in any quadrant, and mapping the converted second text onto the Y-axis in the same quadrant as the first text;
assigning the first character of the converted first text to any coordinate point on the X-axis within that quadrant, and assigning the first character of the converted second text to any coordinate point on the Y-axis within that quadrant.
Preferably, performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text includes:
performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain first match point information;
obtaining a traversal region according to the first match point information, and performing a traversal query over the first text and the second text within the traversal region to obtain the remaining match point information.
Preferably, performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain first match point information includes:
performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain the coordinate points corresponding to identical characters in the first text and the second text;
finding, among the coordinate points corresponding to the identical characters, the coordinate point nearest to the origin, and marking that coordinate point as the first match point.
Preferably, obtaining a traversal region according to the first match point information and performing a traversal query over the first text and the second text within the traversal region to obtain the remaining match point information includes:
obtaining the coordinate point corresponding to the last characters of the first text and the second text, taking the rectangular region between that coordinate point and the coordinate point of the first match point as the traversal region, and performing a traversal query over the first text and the second text within the traversal region;
when a new match point is obtained, updating the traversal region and continuing the traversal query within the new traversal region until no further match point appears.
Preferably, updating the traversal region when a new match point is obtained and continuing the traversal query within the new traversal region until no further match point appears includes:
when a new match point is obtained, taking the rectangular region between the coordinate point corresponding to the last characters of the first text and the second text and the coordinate point of the new match point as the new traversal region;
performing a traversal query over the new traversal region, excluding the new match point itself, until no further match point appears.
Preferably, compiling statistics on the match point information for identical characters in the first text and the second text to obtain a text comparison result includes:
counting the match points according to the match point information for identical characters in the first text and the second text;
obtaining the character lengths of the first text and the second text, and obtaining the text comparison result from the smaller of the two character lengths and the number of match points.
This application also discloses a text comparison device, the device comprising:
a text mapping module, configured to obtain a first text and a second text, convert the first text and the second text each into single-line text, and map the converted first text and second text onto the X-axis and the Y-axis respectively;
a match point query module, configured to perform a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text;
a text comparison module, configured to compile statistics on the match point information for identical characters in the first text and the second text to obtain a text comparison result.
This application also discloses computer equipment comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the text comparison method described above.
This application also discloses a storage medium that can be read and written by a processor, the storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the text comparison method described above.
The beneficial effect of this application is that, by mapping the texts to be compared onto a two-dimensional plane and locating the identical characters between the texts according to the shortest distance between identical characters, it improves the efficiency of text comparison and reduces its complexity.
Brief description of the drawings
Fig. 1 is a flow diagram of a text comparison method according to an embodiment of this application;
Fig. 2 is a flow diagram of a text comparison method according to an embodiment of this application;
Fig. 3 is a flow diagram of a text comparison method according to an embodiment of this application;
Fig. 4 is a flow diagram of a text comparison method according to an embodiment of this application;
Fig. 5 is a flow diagram of a text comparison method according to an embodiment of this application;
Fig. 6 is a flow diagram of a text comparison method according to an embodiment of this application;
Fig. 7 is a flow diagram of a text comparison method according to an embodiment of this application;
Fig. 8 is a structural diagram of a text comparison device according to an embodiment of this application.
Detailed description
To make the purpose, technical solution and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used here may also include the plural. It should further be understood that the word "comprising" used in the description of this application indicates the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
The flow of a text comparison method according to an embodiment of this application is shown in Fig. 1. This embodiment includes the following steps:
Step s101: obtain a first text and a second text, convert the first text and the second text each into single-line text, and map the converted first text and second text onto the X-axis and the Y-axis respectively;
Specifically, an original text usually contains multiple lines, and because margin settings differ, the number of characters per line is likely to differ as well. Therefore, after the two texts to be compared are obtained, each can be converted into single-line text, that is, its multiple lines are joined into one line. After conversion the texts are mapped onto the X-axis and the Y-axis respectively, for example the characters of the first text onto the X-axis and the characters of the second text onto the Y-axis. For ease of calculation, the coordinate of each character can be an integer, with each character occupying one number: for example, the first character of the first text can have the coordinate (1, 0), the second character (2, 0), and so on; likewise, the first character of the second text can have the coordinate (0, 1) and the second character (0, 2).
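The mapping in step s101 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_single_line` and `map_to_axis` are hypothetical, the conversion to a single line is assumed to drop line breaks only, and the first character is assumed to sit at coordinate 1 as in the example above.

```python
def to_single_line(text: str) -> str:
    """Join a multi-line text into one line (assumption: only line breaks are dropped)."""
    return "".join(text.splitlines())


def map_to_axis(text: str, axis: str, start: int = 1) -> list[tuple[str, tuple[int, int]]]:
    """Pair each character with an integer coordinate on the X-axis or the Y-axis."""
    line = to_single_line(text)
    if axis == "x":
        return [(ch, (start + i, 0)) for i, ch in enumerate(line)]
    return [(ch, (0, start + i)) for i, ch in enumerate(line)]


first = map_to_axis("ab\ncd", "x")   # first text on the X-axis
second = map_to_axis("ab\nd", "y")   # second text on the Y-axis
print(first[:2])   # [('a', (1, 0)), ('b', (2, 0))]
print(second[:2])  # [('a', (0, 1)), ('b', (0, 2))]
```

Keeping one integer per character means a character's index and its coordinate differ only by the chosen start offset.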
Step s102: perform a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text;
Specifically, the first match point in the two texts must be found first. A match point is a point at which the characters of the two texts are identical, and the first match point is the one nearest to the start coordinate. The start coordinate depends on the initial mapping positions of the two texts on the X-axis and the Y-axis. For example, if the first characters of the first text and the second text both correspond to the origin, the start coordinate can be the origin; if they do not, the coordinates corresponding to the first characters of the first text and the second text can be used as the start coordinate. When the first text and the second text are mapped, they must be placed in the same quadrant.
Specifically, the first match point can be found by performing a traversal query over the characters of the first text on the X-axis and the characters of the second text on the Y-axis, finding all identical characters in the two texts, and recording their coordinates. For example, if character A in the first text on the X-axis is identical to character B in the second text on the Y-axis, the coordinate of A on the X-axis is (a, 0), and the coordinate of B on the Y-axis is (0, b), then the coordinate of this pair of identical characters is (a, b). The coordinates of the remaining identical characters are obtained in the same way. Among the coordinates of all identical characters, the one nearest to the start coordinate is the first match point coordinate.
Specifically, after the first match point information is obtained, the traversal query continues over the first text and the second text to find the other match points. Once all match point information in the first text and the second text has been obtained, it is stored; the match point information includes the coordinate of each match point and the character it corresponds to.
Step s103: compile statistics on the match point information for identical characters in the first text and the second text to obtain a text comparison result.
Specifically, after all match point information in the first text and the second text has been obtained, the total number of match points is counted first. The character length of the first text is then compared with that of the second text to obtain the smaller of the two; if the two lengths are equal, either may be used. Finally, the number of match points is divided by that character length to obtain the similarity of the two texts.
Specifically, a similarity threshold can also be preset. After the similarity of the two texts is obtained, it is compared with the preset threshold: if the similarity is not less than the threshold, the two texts are considered consistent; otherwise they are considered inconsistent.
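The similarity calculation and threshold check of step s103 can be sketched as follows. The function names, the default threshold of 0.8 and the handling of an empty text are assumptions made for illustration; the source does not specify them.

```python
def similarity(match_count: int, first: str, second: str) -> float:
    """Number of match points divided by the length of the shorter text."""
    shorter = min(len(first), len(second))
    if shorter == 0:
        return 0.0  # assumption: an empty text is treated as having no similarity
    return match_count / shorter


def is_consistent(score: float, threshold: float = 0.8) -> bool:
    """Texts count as consistent when the similarity is not less than the threshold."""
    return score >= threshold


score = similarity(2, "abc", "abd")  # 2 match points, shorter length 3
print(is_consistent(score))          # False, since 2/3 is below the assumed 0.8
```

Dividing by the shorter length means a short text fully contained in a longer one can still reach a similarity of 1.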
In this embodiment, mapping the texts to be compared onto a two-dimensional plane and locating the identical characters between them according to the shortest distance between identical characters improves the efficiency of text comparison and reduces its complexity.
Fig. 2 is a flow diagram of a text comparison method according to an embodiment of this application. As shown in the figure, in step s101, mapping the converted first text and second text onto the X-axis and the Y-axis respectively includes:
Step s201: map the converted first text onto the X-axis in any quadrant, and map the converted second text onto the Y-axis in the same quadrant as the first text;
Specifically, when the texts are mapped, the first text and the second text can be mapped to any quadrant, provided that both are in the same quadrant. Once the quadrant is determined, the first text is mapped onto the X-axis and the second text onto the Y-axis.
Step s202: assign the first character of the converted first text to any coordinate point on the X-axis within that quadrant, and assign the first character of the converted second text to any coordinate point on the Y-axis within that quadrant.
Specifically, when the first text is mapped onto the X-axis, a coordinate must be chosen for its first character, and that coordinate must be consistent with the chosen quadrant. For example, if the first quadrant is chosen, the coordinate of the first character of the first text can be chosen from the range [0, ∞). Likewise, when the second text is mapped onto the Y-axis, a coordinate can be chosen for its first character in the same way.
Specifically, for simplicity, the first quadrant can be chosen for the first text and the second text. When the texts are mapped onto the two-dimensional plane, the first character of the first text can be mapped to the origin or to a point close to it, such as the coordinate point (1, 0), and the first character of the second text can be mapped to the origin or to a point close to it, such as the coordinate point (0, 1).
In this embodiment, mapping the text information onto a plane makes it easy to locate the identical parts of the texts by coordinate, which improves the efficiency of text comparison.
Fig. 3 is a flow diagram of a text comparison method according to an embodiment of this application. As shown in the figure, in step s102, performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text includes:
Step s301: perform a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain first match point information;
Specifically, the first match point can be found by performing a traversal query over the characters of the first text on the X-axis and the characters of the second text on the Y-axis, finding all identical characters in the two texts, and recording their coordinates. For example, if character A in the first text on the X-axis is identical to character B in the second text on the Y-axis, the coordinate of A on the X-axis is (a, 0), and the coordinate of B on the Y-axis is (0, b), then the coordinate of this pair of identical characters is (a, b). The coordinates of the remaining identical characters are obtained in the same way. The traversal query can proceed as follows: take the first character of the first text and traverse the second text from its first character, finding every character identical to it and recording the coordinates; then take the second character of the first text and again traverse the second text from its first character, finding every character identical to that second character and recording the coordinates; and so on, until every character of the first text has been queried;
Specifically, after the coordinates of all identical characters in the first text and the second text have been found, the coordinate nearest to the start coordinate is looked up among them; that coordinate is the first match point coordinate. The start coordinate depends on the initial mapping positions of the two texts on the X-axis and the Y-axis. For example, if the first characters of the first text and the second text both correspond to the origin, the start coordinate can be the origin; if they do not, the coordinates corresponding to the first characters of the first text and the second text can be used as the start coordinate.
Step s302: obtain a traversal region according to the first match point information, and perform a traversal query over the first text and the second text within the traversal region to obtain the remaining match point information.
Specifically, after the first match point is obtained, the rectangular region between the coordinate corresponding to the last characters of the first text and the second text and the coordinate of the first match point can be used as the traversal region, and a traversal query is performed over the first text and the second text within that region to obtain the remaining match point information. Once all match point information in the first text and the second text has been obtained, it is stored; the match point information includes the coordinate of each match point and the character it corresponds to.
In this embodiment, finding the first match point and deriving the traversal region from it to obtain the remaining match points effectively improves the efficiency of text comparison.
Fig. 4 is a flow diagram of a text comparison method according to an embodiment of this application. As shown in the figure, in step s301, performing a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain first match point information includes:
Step s401: perform a traversal query over the first text on the X-axis and the second text on the Y-axis to obtain the coordinate points corresponding to identical characters in the first text and the second text;
Specifically, the first match point can be found by performing a traversal query over the characters of the first text on the X-axis and the characters of the second text on the Y-axis, finding all identical characters in the two texts, and recording their coordinates. For example, if character A in the first text on the X-axis is identical to character B in the second text on the Y-axis, the coordinate of A on the X-axis is (a, 0), and the coordinate of B on the Y-axis is (0, b), then the coordinate of this pair of identical characters is (a, b). The coordinates of the remaining identical characters are obtained in the same way. The traversal query can proceed as follows: take the first character of the first text and traverse the second text from its first character, finding every character identical to it and recording the coordinates; then take the second character of the first text and again traverse the second text from its first character, finding every character identical to that second character and recording the coordinates; and so on, until every character of the first text has been queried.
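The nested traversal of step s401 can be sketched as below. The function name `matching_coordinates` is hypothetical, both inputs are assumed to be already converted to single-line text, and the first character of each text is assumed to sit at coordinate 1.

```python
def matching_coordinates(first: str, second: str, start: int = 1) -> list[tuple[int, int]]:
    """Return (a, b) for every pair of identical characters in the two texts."""
    points = []
    for i, ch_x in enumerate(first):        # each character of the first text (X-axis)
        for j, ch_y in enumerate(second):   # rescan the second text from its first character
            if ch_x == ch_y:
                points.append((start + i, start + j))
    return points


print(matching_coordinates("abca", "bad"))  # [(1, 2), (2, 1), (4, 2)]
```

Each character of the first text triggers one full pass over the second text, so this step visits every (x, y) pair once.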
Step s402: among the coordinate points corresponding to the identical characters, find the coordinate point nearest to the origin and mark it as the first match point.
Specifically, after the coordinates of all identical characters in the first text and the second text have been found, the coordinate nearest to the start coordinate is looked up among them. The shortest distance can be calculated with the Pythagorean theorem, and the coordinate nearest to the start coordinate is the first match point coordinate. The start coordinate depends on the initial mapping positions of the two texts on the X-axis and the Y-axis. For example, if the first characters of the first text and the second text both correspond to the origin, the start coordinate can be the origin; if they do not, the coordinates corresponding to the first characters of the first text and the second text can be used as the start coordinate.
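Selecting the first match point in step s402 can be sketched as a minimum over distances. The function name `first_match_point` is hypothetical. The sketch compares squared distances, which gives the same ordering as the Pythagorean distance while staying in exact integer arithmetic, and it breaks ties by the smaller coordinate; the tie-breaking rule is an assumption, since the source does not say what happens when two points are equally near.

```python
def first_match_point(points, origin=(0, 0)):
    """Return the coordinate nearest to the start coordinate, or None if there is none."""
    if not points:
        return None
    ox, oy = origin

    def key(p):
        dx, dy = p[0] - ox, p[1] - oy
        # Squared Pythagorean distance first, then the point itself as a tie-breaker.
        return (dx * dx + dy * dy, p)

    return min(points, key=key)


print(first_match_point([(1, 2), (2, 1), (4, 2)]))  # (1, 2)
```

Here (1, 2) and (2, 1) are equally far from the origin, and the assumed tie-breaker picks (1, 2).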
In this embodiment, comparing the distances of the coordinates of all identical characters allows the first match point to be obtained quickly, which effectively improves the efficiency of text comparison.
Fig. 5 is a flow diagram of a text comparison method according to an embodiment of this application. As shown in the figure, in step s302, obtaining a traversal region according to the first match point information and performing a traversal query over the first text and the second text within the traversal region to obtain the remaining match point information includes:
Step s501: obtain the coordinate point corresponding to the last characters of the first text and the second text, take the rectangular region between that coordinate point and the coordinate point of the first match point as the traversal region, and perform a traversal query over the first text and the second text within the traversal region;
Specifically, the coordinate point of the last character of the first text can be read from the X-axis and the coordinate point of the last character of the second text from the Y-axis; together they give the coordinate point corresponding to the last characters of the two texts. For example, if the coordinate point of the last character of the first text is (A, 0) and that of the last character of the second text is (0, B), the coordinate point corresponding to the last characters of the two texts is (A, B). The rectangular region between this coordinate point and the coordinate point of the first match point is used as the traversal region: for example, if the first match point is (C, D), the traversal region extends from (C, D) to (A, B). Once the traversal region is determined, a traversal query can be performed over the first text and the second text within it.
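The traversal region of step s501 can be sketched as a simple rectangle test. The function names are hypothetical, and the rectangle is assumed to include its boundary, which the source does not state explicitly.

```python
def in_region(point, corner, last):
    """True when point lies in the rectangle from corner (C, D) to last (A, B), inclusive."""
    (x, y), (cx, cy), (ax, ay) = point, corner, last
    return cx <= x <= ax and cy <= y <= ay


def points_in_region(points, corner, last):
    """Keep only the candidate coordinates that fall inside the traversal region."""
    return [p for p in points if in_region(p, corner, last)]


candidates = [(1, 2), (2, 1), (4, 2)]
print(points_in_region(candidates, corner=(1, 2), last=(4, 3)))  # [(1, 2), (4, 2)]
```

In this example (2, 1) drops out because its Y coordinate lies before the first match point.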
Step s502: when a new match point is obtained, update the traversal region and continue the traversal query within the new traversal region until no further match point appears.
Specifically, when a traversal query is performed within the traversal region, a new match point can be obtained by finding the point at the shortest distance from the first match point. When a new match point is obtained, the traversal region can be redefined as in step s501: the new traversal region is determined by the new match point and the coordinate point corresponding to the last characters of the two texts. For example, if the new match point is (E, F), the traversal region extends from (E, F) to (A, B). When a traversal query is performed within the new traversal region, the next match point can be obtained by finding the point at the shortest distance from the new match point, and so on, until no further match point appears.
In this embodiment, updating the traversal region narrows the range of the traversal query, reduces the amount of calculation and improves the efficiency of text comparison.
Fig. 6 is a flow diagram of a text comparison method according to an embodiment of this application. As shown in the figure, in step s502, updating the traversal region when a new match point is obtained and continuing the traversal query within the new traversal region until no further match point appears includes:
Step s601: when a new match point is obtained, take the rectangular region between the coordinate point corresponding to the last characters of the first text and the second text and the coordinate point of the new match point as the new traversal region;
Specifically, when a new match point is obtained, the traversal region can be redefined as in step s501: the new traversal region is determined by the new match point and the coordinate point corresponding to the last characters of the two texts. For example, if the new match point is (E, F), the traversal region extends from (E, F) to (A, B).
Step s602: perform a traversal query over the new traversal region, excluding the new match point itself, until no further match point appears.
Specifically, when a traversal query is performed within the new traversal region, the next match point can be obtained by finding the point at the shortest distance from the new match point, and so on, until no further match point appears. During this traversal query the new match point itself can be excluded from the traversal region, that is, it does not need to be queried again.
Specifically, the looping traversal query keeps producing new match points, and the traversal region is updated according to each of them. When the next match point is determined, the shortest distance is calculated between the coordinates of identical characters and the previous match point. If a round of traversal and shortest-distance calculation produces no new match point, the traversal query ends and the text comparison is complete.
In the present embodiment, all match point information are obtained by looping through inquiry, text is can effectively improve and compares effect Rate reduces complexity.
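The loop described in steps s601 and s602 can be sketched as follows. This is an illustrative reading, not the applicant's implementation: it assumes 1-based integer coordinates in the first quadrant, Euclidean distance, the origin as the reference point for the first match point, and first-found tie-breaking. The function name `find_match_points` is hypothetical.

```python
import math


def find_match_points(first_text, second_text):
    """Return match points (x, y) found by repeated nearest-match queries.

    Assumptions: characters sit at coordinates 1..len(text); distance is
    Euclidean; ties go to the first candidate met in scan order.
    """
    end_x, end_y = len(first_text), len(second_text)  # (A, B)
    matches = []
    ref = (0, 0)          # distance reference: origin, then the latest match
    min_x, min_y = 1, 1   # lower corner of the current traversal region
    while True:
        best, best_dist = None, math.inf
        for x in range(min_x, end_x + 1):
            for y in range(min_y, end_y + 1):
                if (x, y) == ref:
                    continue  # the latest match point is not queried again
                if first_text[x - 1] != second_text[y - 1]:
                    continue  # not an identical character
                dist = math.hypot(x - ref[0], y - ref[1])
                if dist < best_dist:
                    best, best_dist = (x, y), dist
        if best is None:
            return matches  # no further match point: comparison ends
        matches.append(best)
        ref = best
        min_x, min_y = best  # new region runs from (E, F) to (A, B)


print(find_match_points("abcde", "abxde"))
# [(1, 1), (2, 2), (4, 4), (5, 5)]
```

Because the region keeps its boundary row and column, a repeated character lying on that boundary can be matched again. A stricter variant that starts the next region at (E + 1, F + 1) would avoid this, but that is a design choice beyond what the text specifies.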
Fig. 7 is a schematic flowchart of a text comparison method according to an embodiment of the present application. As shown, step s103 (performing statistics on the match point information for identical characters in the first text and the second text to obtain a text comparison result) comprises:
Step s701: count the number of match points according to the match point information for identical characters in the first text and the second text.
Specifically, once all match point information in the first text and the second text has been obtained, the total number of match points can be counted first.
Step s702: obtain the character lengths of the first text and the second text, and obtain the text comparison result from the smaller of the two lengths and the number of match points.
Specifically, the character length of the first text can be compared with that of the second text to obtain the smaller of the two lengths; if the two texts have the same length, the length of either text can be used. Finally, the number of match points is divided by that length to obtain the similarity of the two texts.
In this embodiment, the similarity between the two texts is obtained from the number of match points, which can effectively reduce the complexity of text comparison.
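Steps s701 and s702 amount to a single division, sketched below. The function name `similarity` and the handling of an empty text (returning 0.0) are assumptions; the application does not say what happens when a text has no characters.

```python
def similarity(match_count, first_text, second_text):
    """Similarity = number of match points / length of the shorter text."""
    shorter = min(len(first_text), len(second_text))
    if shorter == 0:
        return 0.0  # assumption: an empty text is treated as no similarity
    return match_count / shorter


# Hypothetical example: 4 match points between two 5-character texts.
print(similarity(4, "abcde", "abxde"))  # 0.8
```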
The structure of a text comparison apparatus according to an embodiment of the present application is shown in Fig. 8 and comprises:
a text mapping module 801, a match point query module 802, and a text comparison module 803. The text mapping module 801 is connected to the match point query module 802, and the match point query module 802 is connected to the text comparison module 803. The text mapping module 801 is configured to obtain a first text and a second text, convert the first text and the second text into single-line text respectively, and map the converted first text and second text to the X-axis and the Y-axis respectively. The match point query module 802 is configured to carry out traversal queries on the first text on the X-axis and the second text on the Y-axis and obtain match point information for identical characters in the first text and the second text. The text comparison module 803 is configured to perform statistics on the match point information for identical characters in the first text and the second text and obtain a text comparison result.
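As an illustration of what the text mapping module 801 might do, the sketch below joins a multi-line text into one line and pairs each character with an axis coordinate. It assumes that "single-line text" means the text with line breaks removed and that coordinates start at 1; the function names `to_single_line` and `map_to_axis` are hypothetical.

```python
def to_single_line(text):
    """Join all lines into one string, dropping the line breaks.

    Assumption: this is what converting to single-line text means here.
    """
    return "".join(text.splitlines())


def map_to_axis(single_line, start=1):
    """Pair each character with an integer coordinate along one axis."""
    return [(start + i, ch) for i, ch in enumerate(single_line)]


first = to_single_line("ab\ncd")      # mapped to the X-axis
second = to_single_line("ab\r\nxd")   # mapped to the Y-axis
print(map_to_axis(first))   # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
print(map_to_axis(second))  # [(1, 'a'), (2, 'b'), (3, 'x'), (4, 'd')]
```

The `start` parameter reflects the statement that the first character may be placed at any coordinate point on its axis.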
An embodiment of the present application also discloses a computer device comprising a memory and a processor. Computer-readable instructions are stored in the memory, and when the computer-readable instructions are executed by one or more processors, they cause the one or more processors to perform the steps of the text comparison method described in the above embodiments.
An embodiment of the present application also discloses a storage medium that can be read and written by a processor. The storage medium stores computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the text comparison method described in the above embodiments.
Those of ordinary skill in the art will understand that all or part of the processes in the above embodiment methods can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a random access memory (RAM).
The technical features of the embodiments described above can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present application. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A text comparison method, characterized by comprising the following steps:
obtaining a first text and a second text, converting the first text and the second text into single-line text respectively, and mapping the converted first text and second text to the X-axis and the Y-axis respectively;
carrying out traversal queries on the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text;
performing statistics on the match point information for identical characters in the first text and the second text to obtain a text comparison result.
2. The text comparison method according to claim 1, characterized in that mapping the converted first text and second text to the X-axis and the Y-axis respectively comprises:
mapping the converted first text to any quadrant of the X-axis, and mapping the converted second text to the quadrant of the Y-axis that is the same as that of the first text;
placing the first character of the converted first text at any coordinate point on the X-axis in that quadrant, and placing the first character of the converted second text at any coordinate point on the Y-axis in that quadrant.
3. The text comparison method according to claim 2, characterized in that carrying out traversal queries on the first text on the X-axis and the second text on the Y-axis to obtain match point information for identical characters in the first text and the second text comprises:
carrying out traversal queries on the first text on the X-axis and the second text on the Y-axis to obtain first match point information;
obtaining a traversal region according to the first match point information, and carrying out traversal queries on the first text and the second text on the traversal region to obtain the remaining match point information.
4. The text comparison method according to claim 3, characterized in that carrying out traversal queries on the first text on the X-axis and the second text on the Y-axis to obtain first match point information comprises:
carrying out traversal queries on the first text on the X-axis and the second text on the Y-axis to obtain the coordinate points corresponding to identical characters in the first text and the second text;
querying, among the coordinate points corresponding to the identical characters, the coordinate point nearest to the origin, and marking that coordinate point as the first match point.
5. The text comparison method according to claim 3, characterized in that obtaining a traversal region according to the first match point information, and carrying out traversal queries on the first text and the second text on the traversal region to obtain the remaining match point information, comprises:
obtaining the coordinate point corresponding to the last character of the first text and the second text, taking the rectangular region between that coordinate point and the coordinate point corresponding to the first match point as the traversal region, and carrying out traversal queries on the first text and the second text on the traversal region;
when a new match point is obtained, updating the traversal region and continuing the traversal queries on the new traversal region until no further match point appears.
6. The text comparison method according to claim 5, characterized in that, when a new match point is obtained, updating the traversal region and continuing the traversal queries on the new traversal region until no further match point appears comprises:
when a new match point is obtained, taking the rectangular region between the coordinate point corresponding to the last character of the first text and the second text and the coordinate point corresponding to the new match point as the new traversal region;
carrying out traversal queries on the part of the new traversal region other than the new match point, until no further match point appears.
7. The text comparison method according to claim 1, characterized in that performing statistics on the match point information for identical characters in the first text and the second text to obtain a text comparison result comprises:
counting the number of match points according to the match point information for identical characters in the first text and the second text;
obtaining the character lengths of the first text and the second text, and obtaining the text comparison result from the smaller of the character lengths and the number of match points.
8. A text comparison apparatus, characterized in that the apparatus comprises:
a text mapping module, configured to obtain a first text and a second text, convert the first text and the second text into single-line text respectively, and map the converted first text and second text to the X-axis and the Y-axis respectively;
a match point query module, configured to carry out traversal queries on the first text on the X-axis and the second text on the Y-axis and obtain match point information for identical characters in the first text and the second text;
a text comparison module, configured to perform statistics on the match point information for identical characters in the first text and the second text and obtain a text comparison result.
9. A computer device, characterized in that the computer device comprises a memory and a processor, computer-readable instructions being stored in the memory, and the computer-readable instructions, when executed by one or more processors, causing the one or more processors to perform the steps of the text comparison method according to any one of claims 1 to 7.
10. A storage medium, characterized in that the storage medium can be read and written by a processor, the storage medium storing computer instructions, and the computer-readable instructions, when executed by one or more processors, causing the one or more processors to perform the steps of the text comparison method according to any one of claims 1 to 7.
CN201910297625.1A 2019-04-15 2019-04-15 Text comparison method, apparatus, computer device and storage medium Active CN110147429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910297625.1A CN110147429B (en) 2019-04-15 2019-04-15 Text comparison method, apparatus, computer device and storage medium

Publications (2)

Publication Number Publication Date
CN110147429A true CN110147429A (en) 2019-08-20
CN110147429B CN110147429B (en) 2023-08-15

Family

ID=67588900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910297625.1A Active CN110147429B (en) 2019-04-15 2019-04-15 Text comparison method, apparatus, computer device and storage medium

Country Status (1)

Country Link
CN (1) CN110147429B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102063510A (en) * 2011-01-17 2011-05-18 珠海全志科技有限公司 Method for searching matched character string
CN106980870A (en) * 2016-12-30 2017-07-25 中国银联股份有限公司 Text matches degree computational methods between short text
CN107085568A (en) * 2017-03-29 2017-08-22 腾讯科技(深圳)有限公司 A kind of text similarity method of discrimination and device
CN107315817A (en) * 2017-06-30 2017-11-03 华自科技股份有限公司 Electronic drawing text matching technique, device, storage medium and computer equipment
CN107679219A (en) * 2017-10-19 2018-02-09 广州视睿电子科技有限公司 Matching process and device, interactive intelligent tablet computer and storage medium
CN108170684A (en) * 2018-01-22 2018-06-15 京东方科技集团股份有限公司 Text similarity computing method and system, data query system and computer product
CN108182222A (en) * 2017-12-26 2018-06-19 东软集团股份有限公司 A kind of text matching technique and device
CN108920580A (en) * 2018-06-25 2018-11-30 腾讯科技(深圳)有限公司 Image matching method, device, storage medium and terminal

Also Published As

Publication number Publication date
CN110147429B (en) 2023-08-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant