CN106815593A - Method and apparatus for determining Chinese text similarity - Google Patents

Info

Publication number
CN106815593A
Authority
CN
China
Prior art keywords
phonetic
text
chinese
unit
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510850305.6A
Other languages
Chinese (zh)
Other versions
CN106815593B (en
Inventor
刘粉香
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201510850305.6A priority Critical patent/CN106815593B/en
Publication of CN106815593A publication Critical patent/CN106815593A/en
Application granted granted Critical
Publication of CN106815593B publication Critical patent/CN106815593B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application discloses a method and an apparatus for determining Chinese text similarity. The method includes: converting the Chinese characters in a first Chinese text into pinyin to obtain a first pinyin text, and converting the Chinese characters in a second Chinese text into pinyin to obtain a second pinyin text; counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and in the second pinyin text; generating a first feature vector from the counts for the first pinyin text and a second feature vector from the counts for the second pinyin text; calculating the distance between the first feature vector and the second feature vector; and determining the similarity of the first Chinese text and the second Chinese text from the distance, where the smaller the distance, the higher the similarity. The application addresses the technical problem that the prior art has difficulty effectively recognizing similar text caused by misspelling.

Description

Method and apparatus for determining Chinese text similarity
Technical field
The application relates to the field of text processing, and in particular to a method and an apparatus for determining Chinese text similarity.
Background
When analyzing text, it is often necessary to correct errors in it, that is, to correct wrongly written words. For example, given the user input "dangerous hand-pulled noodles", the system should recognize that the word the user probably intended is the similar text "taste-thousand hand-pulled noodles". Current methods for finding similar text mainly count the number of matching characters between two strings: the more characters match, the higher the similarity of the texts.
However, the inventors found that prior-art schemes have difficulty effectively recognizing similar text caused by misspelling. For example, such a scheme rates "Chiba hand-pulled noodles" as more similar to "taste-thousand hand-pulled noodles" than "dangerous hand-pulled noodles" is.
No effective solution to this problem has yet been proposed.
Summary of the invention
The embodiments of the application provide a method and an apparatus for determining Chinese text similarity, to at least solve the technical problem that the prior art has difficulty effectively recognizing similar text caused by misspelling.
According to one aspect of the embodiments of the application, a method for determining Chinese text similarity is provided, including: converting the Chinese characters in a first Chinese text into pinyin to obtain a first pinyin text, and converting the Chinese characters in a second Chinese text into pinyin to obtain a second pinyin text; counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and the number of each kind of pinyin unit in the second pinyin text; generating a first feature vector from the number of each kind of pinyin unit in the first pinyin text, and generating a second feature vector from the number of each kind of pinyin unit in the second pinyin text; calculating the distance between the first feature vector and the second feature vector; and determining the similarity of the first Chinese text and the second Chinese text according to the distance, where the smaller the distance, the higher the similarity of the first Chinese text and the second Chinese text.
Further, counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and in the second pinyin text includes: treating each initial in the pinyin of a Chinese character as one pinyin unit and each final as one pinyin unit, and counting the number of each kind of initial and each kind of final in the first pinyin text and in the second pinyin text.
Further, counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and in the second pinyin text includes: treating each whole-read syllable as one pinyin unit and, for pinyin that is not a whole-read syllable, treating its initial as one pinyin unit and its final as one pinyin unit; and counting the number of each kind of initial, each kind of final and each kind of whole-read syllable in the first pinyin text and in the second pinyin text.
Further, generating the first feature vector and the second feature vector includes: inserting the number of each kind of pinyin unit in the first pinyin text into the position of the corresponding dimension of a preset vector to obtain the first feature vector, and inserting the number of each kind of pinyin unit in the second pinyin text into the position of the corresponding dimension of the preset vector to obtain the second feature vector, where the preset vector is a vector whose dimensions correspond one-to-one to the kinds of pinyin unit arranged in a preset order.
Further, calculating the distance between the first feature vector and the second feature vector includes: calculating the difference in each corresponding dimension of the first feature vector and the second feature vector; and taking the absolute value of each difference and adding the absolute values together to obtain the distance.
According to another aspect of the embodiments of the application, an apparatus for determining Chinese text similarity is also provided, including: a conversion unit, configured to convert the Chinese characters in a first Chinese text into pinyin to obtain a first pinyin text, and to convert the Chinese characters in a second Chinese text into pinyin to obtain a second pinyin text; a counting unit, configured to count, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and in the second pinyin text; a generation unit, configured to generate a first feature vector from the number of each kind of pinyin unit in the first pinyin text and a second feature vector from the number of each kind of pinyin unit in the second pinyin text; a calculation unit, configured to calculate the distance between the first feature vector and the second feature vector; and a determination unit, configured to determine the similarity of the first Chinese text and the second Chinese text according to the distance, where the smaller the distance, the higher the similarity of the first Chinese text and the second Chinese text.
Further, the counting unit is specifically configured to treat each initial in the pinyin of a Chinese character as one pinyin unit and each final as one pinyin unit, and to count the number of each kind of initial and each kind of final in the first pinyin text and in the second pinyin text.
Further, the counting unit is specifically configured to treat each whole-read syllable as one pinyin unit and, for pinyin that is not a whole-read syllable, to treat its initial as one pinyin unit and its final as one pinyin unit, and to count the number of each kind of initial, each kind of final and each kind of whole-read syllable in the first pinyin text and in the second pinyin text.
Further, the generation unit is specifically configured to insert the number of each kind of pinyin unit in the first pinyin text into the position of the corresponding dimension of a preset vector to obtain the first feature vector, and to insert the number of each kind of pinyin unit in the second pinyin text into the position of the corresponding dimension of the preset vector to obtain the second feature vector, where the preset vector is a vector whose dimensions correspond one-to-one to the kinds of pinyin unit arranged in a preset order.
Further, the calculation unit includes: a first calculation module, configured to calculate the difference in each corresponding dimension of the first feature vector and the second feature vector; and a second calculation module, configured to take the absolute value of each difference and add the absolute values together to obtain the distance.
According to the embodiments of the invention, the Chinese characters in the first Chinese text are converted into pinyin to obtain the first pinyin text, and the Chinese characters in the second Chinese text are converted into pinyin to obtain the second pinyin text; the number of each kind of pinyin unit in each pinyin text is counted according to the rules of Hanyu Pinyin; the first feature vector and the second feature vector are generated from these counts; the distance between the two feature vectors is calculated; and the similarity of the first Chinese text and the second Chinese text is determined from the distance, where the smaller the distance, the higher the similarity. This solves the technical problem that the prior art has difficulty effectively recognizing similar text caused by misspelling, and makes it possible to recognize such text.
Brief description of the drawings
The drawings described here provide a further understanding of the application and form a part of it. The illustrative embodiments of the application and their descriptions serve to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the method for determining Chinese text similarity according to an embodiment of the application;
Fig. 2 is a schematic diagram of the apparatus for determining Chinese text similarity according to an embodiment of the application.
Detailed description of the embodiments
To help those skilled in the art better understand the solution of the application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the application without creative effort fall within the scope of protection of the application.
It should be noted that the terms "first", "second" and the like in the description, the claims and the drawings are used to distinguish similar objects and do not describe a specific order or sequence. Data used in this way are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
According to an embodiment of the application, a method embodiment of the method for determining Chinese text similarity is provided. It should be noted that the steps shown in the flowchart may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps may be performed in an order different from the one shown or described here.
Fig. 1 is a flowchart of the method for determining Chinese text similarity according to an embodiment of the application. As shown in Fig. 1, the method includes the following steps:
Step S102: convert the Chinese characters in the first Chinese text into pinyin to obtain the first pinyin text, and convert the Chinese characters in the second Chinese text into pinyin to obtain the second pinyin text.
The first Chinese text and the second Chinese text may each be an article, a sentence, a phrase and so on. They are the two texts whose similarity is to be determined. In this embodiment, the first Chinese text and the second Chinese text are each converted into a pinyin text: every character in the Chinese text is converted into its corresponding pinyin, and together these form the pinyin text. For example, "in high spirits" is converted into "xing gao cai lie".
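As a minimal sketch of the conversion in step S102, the snippet below maps each character to toneless pinyin through a lookup table. The table `TOY_PINYIN`, the function name `to_pinyin_text` and the assumption that "in high spirits" corresponds to the four characters 兴高采烈 are illustrative only and do not come from the patent; a real system would use a full character-to-pinyin dictionary.

```python
# Toy lookup table: a hypothetical stand-in for a full character-to-pinyin dictionary.
TOY_PINYIN = {"兴": "xing", "高": "gao", "采": "cai", "烈": "lie"}


def to_pinyin_text(chinese_text: str) -> str:
    """Convert each Chinese character to toneless pinyin, separated by spaces."""
    # Characters missing from the toy table raise KeyError instead of being guessed.
    return " ".join(TOY_PINYIN[ch] for ch in chinese_text)


print(to_pinyin_text("兴高采烈"))  # xing gao cai lie
```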
Step S104: count, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and the number of each kind of pinyin unit in the second pinyin text.
Under the spelling rules of Hanyu Pinyin, a syllable is an initial plus a final; that is, the pinyin of each Chinese character consists of one or more pinyin units, and initials and finals can serve as pinyin units. Because Hanyu Pinyin also includes whole-read syllables, a whole-read syllable can likewise serve as a pinyin unit.
For example, "xing gao cai lie" above can be split into the pinyin units "x", "ing", "g", "ao", "c", "ai", "l" and "ie", each with a count of 1. For the pinyin text "gao gao xing xing", the counts of "g", "ao", "x" and "ing" are each 2.
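The counting in step S104 can be sketched as follows, assuming the pinyin text is space-separated and toneless. The `INITIALS` list (the 23 initials usually taught, with y and w counted as initials) and the function name `count_units` are assumptions made for illustration; everything after the initial is treated as the final.

```python
from collections import Counter

# Assumed list of 23 initials; two-letter initials come first so "zh" wins over "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def count_units(pinyin_text: str) -> Counter:
    """Count each kind of initial and final in a space-separated pinyin text."""
    counts = Counter()
    for syllable in pinyin_text.split():
        # Find the initial, if any; syllables such as "ai" have only a final.
        initial = next((i for i in INITIALS if syllable.startswith(i)), "")
        if initial:
            counts[initial] += 1
        final = syllable[len(initial):]
        if final:
            counts[final] += 1
    return counts


print(count_units("xing gao cai lie"))
print(count_units("gao gao xing xing"))
```

The first call counts each of the eight units once; the second gives a count of 2 for each of "g", "ao", "x" and "ing", matching the example in the text.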
Step S106: generate the first feature vector from the number of each kind of pinyin unit in the first pinyin text, and generate the second feature vector from the number of each kind of pinyin unit in the second pinyin text.
After the number of each kind of pinyin unit in the two pinyin texts has been counted, the corresponding feature vectors are generated from these counts. A feature vector may be a vector with multiple dimensions, and the first feature vector and the second feature vector have the same number of dimensions.
Optionally, the feature vectors can be generated in either of two ways. In the first, all kinds of pinyin unit in current Hanyu Pinyin are sorted in a preset order, each kind of pinyin unit corresponds to one dimension of the feature vector, and the count of each kind of pinyin unit in a pinyin text is the value of the corresponding dimension. In the second, the kinds of pinyin unit that appear in the two pinyin texts are collected, and the generated feature vectors have as many dimensions as there are kinds; the count of each kind of pinyin unit in each pinyin text is the value of the corresponding dimension in that text's feature vector. For example, for the two pinyin texts "gao gao xing xing" and "gao gao xin xin", the kinds of pinyin unit are "g", "ao", "x", "ing" and "in", so the generated feature vectors have 5 dimensions. With the ordering ("g", "ao", "x", "ing", "in"), the feature vector of the first pinyin text (the first feature vector) is [2,2,2,2,0], and the feature vector of the second pinyin text (the second feature vector) is [2,2,2,0,2].
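The second generating mode (dimensions taken from the units that actually occur in the two texts) can be sketched as below. The function name `build_vectors` and the choice of first-appearance order are assumptions for illustration; the text only requires that both vectors share one ordering.

```python
from collections import Counter


def build_vectors(first: Counter, second: Counter):
    """Build two equal-length count vectors over the units seen in either text."""
    # Order of first appearance, with duplicates removed.
    order = list(dict.fromkeys(list(first) + list(second)))
    vector1 = [first[unit] for unit in order]   # a Counter returns 0 for missing units
    vector2 = [second[unit] for unit in order]
    return order, vector1, vector2


# Unit counts for "gao gao xing xing" and "gao gao xin xin".
first = Counter({"g": 2, "ao": 2, "x": 2, "ing": 2})
second = Counter({"g": 2, "ao": 2, "x": 2, "in": 2})

order, v1, v2 = build_vectors(first, second)
print(order)  # ['g', 'ao', 'x', 'ing', 'in']
print(v1)     # [2, 2, 2, 2, 0]
print(v2)     # [2, 2, 2, 0, 2]
```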
Step S108: calculate the distance between the first feature vector and the second feature vector.
Step S110: determine the similarity of the first Chinese text and the second Chinese text according to the distance, where the smaller the distance, the higher the similarity of the first Chinese text and the second Chinese text.
After the first feature vector and the second feature vector are generated, the distance between the two vectors is calculated; the distance may be, for example, the Euclidean distance. The similarity between the two Chinese texts is then determined from the calculated distance: the larger the distance, the lower the similarity, and the smaller the distance, the higher the similarity. For example, the method determines that the similarity of "Chiba hand-pulled noodles" to "taste-thousand hand-pulled noodles" is lower than that of "dangerous hand-pulled noodles" to "taste-thousand hand-pulled noodles", so the similar text of a misspelled text can be determined.
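Steps S104 to S110 can be sketched end to end as follows, using the Euclidean distance. The three pinyin strings are assumed romanizations of the noodle examples, which this translation gives only as English glosses; the `INITIALS` list and the function names are likewise illustrative assumptions.

```python
import math
from collections import Counter

# Assumed list of 23 initials; two-letter initials come first.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def count_units(pinyin_text: str) -> Counter:
    """Count each kind of initial and final in a space-separated pinyin text."""
    counts = Counter()
    for syllable in pinyin_text.split():
        initial = next((i for i in INITIALS if syllable.startswith(i)), "")
        if initial:
            counts[initial] += 1
        final = syllable[len(initial):]
        if final:
            counts[final] += 1
    return counts


def euclidean_distance(a: Counter, b: Counter) -> float:
    """Euclidean distance between two count vectors over the union of their units."""
    return math.sqrt(sum((a[u] - b[u]) ** 2 for u in set(a) | set(b)))


def rank_by_similarity(target: str, candidates: list) -> list:
    """Sort candidates from most similar (smallest distance) to least similar."""
    target_counts = count_units(target)
    return sorted(candidates,
                  key=lambda c: euclidean_distance(count_units(c), target_counts))


# Assumed romanizations: target, "Chiba" candidate, "dangerous" candidate.
ranked = rank_by_similarity("wei qian la mian",
                            ["qian ye la mian", "wei xian la mian"])
print(ranked)  # ['wei xian la mian', 'qian ye la mian']
```

Under these assumptions the misspelled candidate differs from the target in only two units ("x" versus "q"), while the other candidate differs in four, so the misspelled candidate ranks first.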
According to the embodiments of the invention, the Chinese characters in the first Chinese text are converted into pinyin to obtain the first pinyin text, and the Chinese characters in the second Chinese text are converted into pinyin to obtain the second pinyin text; the number of each kind of pinyin unit in each pinyin text is counted according to the rules of Hanyu Pinyin; the first feature vector and the second feature vector are generated from these counts; the distance between the two feature vectors is calculated; and the similarity of the first Chinese text and the second Chinese text is determined from the distance, where the smaller the distance, the higher the similarity. This solves the technical problem that the prior art has difficulty effectively recognizing similar text caused by misspelling, and makes it possible to recognize such text.
Preferably, counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and in the second pinyin text includes: treating each initial in the pinyin of a Chinese character as one pinyin unit and each final as one pinyin unit, and counting the number of each kind of initial and each kind of final in the first pinyin text and in the second pinyin text.
Hanyu Pinyin is written in the Latin alphabet and is divided into initials and finals, so the pinyin of each Chinese character can be split into an initial and a final (some characters have only a final, such as the character for "love"). In this embodiment, each initial is treated as one pinyin unit and each final as one pinyin unit; the pinyin of each Chinese character in the pinyin text is split into its initial and final, and the number of each kind of initial and each kind of final is counted.
Alternatively, counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and in the second pinyin text includes: treating each whole-read syllable as one pinyin unit and, for pinyin that is not a whole-read syllable, treating its initial as one pinyin unit and its final as one pinyin unit; and counting the number of each kind of initial, each kind of final and each kind of whole-read syllable in the first pinyin text and in the second pinyin text.
Hanyu Pinyin includes syllables whose pronunciation, after a final is added to the initial, is still the same as the initial (or, after an initial is added to the final, is still the same as the final); these are the whole-read syllables. In this embodiment, each whole-read syllable is treated as one pinyin unit, while pinyin that is not a whole-read syllable is split into its initial and final as pinyin units, and the number of each kind of pinyin unit is counted. For example, Hanyu Pinyin includes 23 initials, 24 finals and 16 whole-read syllables, so there are 63 kinds of pinyin unit.
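A minimal sketch of this alternative splitting rule is shown below. The set `WHOLE_SYLLABLES` is the list of 16 whole-read syllables as commonly taught and the `INITIALS` list is the usual 23; both, together with the function name `split_units`, are assumptions for illustration, since the patent does not enumerate them.

```python
# Assumed list of the 16 whole-read syllables.
WHOLE_SYLLABLES = {"zhi", "chi", "shi", "ri", "zi", "ci", "si",
                   "yi", "wu", "yu", "ye", "yue", "yuan", "yin", "yun", "ying"}

# Assumed list of 23 initials; two-letter initials come first.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_units(syllable: str) -> list:
    """Split one toneless pinyin syllable into pinyin units."""
    # A whole-read syllable is kept as a single unit.
    if syllable in WHOLE_SYLLABLES:
        return [syllable]
    # Otherwise split into initial (if any) and final.
    initial = next((i for i in INITIALS if syllable.startswith(i)), "")
    final = syllable[len(initial):]
    return [unit for unit in (initial, final) if unit]


print(split_units("shi"))  # ['shi']
print(split_units("gao"))  # ['g', 'ao']
print(split_units("ai"))   # ['ai']
```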
Preferably, generating the first feature vector from the number of each kind of pinyin unit in the first pinyin text and the second feature vector from the number of each kind of pinyin unit in the second pinyin text includes: inserting the number of each kind of pinyin unit in the first pinyin text into the position of the corresponding dimension of a preset vector to obtain the first feature vector, and inserting the number of each kind of pinyin unit in the second pinyin text into the position of the corresponding dimension of the preset vector to obtain the second feature vector, where the preset vector is a vector whose dimensions correspond one-to-one to the kinds of pinyin unit arranged in a preset order.
In this embodiment, each dimension of the preset vector represents one kind of pinyin unit, and in the generated feature vector the value of each dimension is the number of times the corresponding pinyin unit occurs in the pinyin text. All pinyin units are sorted in a preset order, each corresponding to one dimension of the preset vector; the preset order may be chosen arbitrarily.
For example, in the above embodiment that counts pinyin units by initial, final and whole-read syllable, the counts of all initials, finals and whole-read syllables in the two pinyin texts are inserted into 63-dimensional preset vectors to generate the feature vectors of the two pinyin texts, where 63 is the total number of initials, finals and whole-read syllables in pinyin. For instance, the pinyin of "happy" is "gao gao xing xing", in which "g", "ao", "x" and "ing" each occur 2 times. In the 63-dimensional pinyin feature vector of "happy", the positions of the corresponding initials and finals are therefore 2 and all other positions are 0, giving the feature vector [..., 2, ..., 2, ..., 2, ..., 2, ...] (the omitted entries are 0).
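The 63-dimensional preset vector can be sketched as follows. The three inventories (23 initials, 24 finals, 16 whole-read syllables) are the lists commonly taught and are assumptions here, as is the ordering initials, then finals, then whole-read syllables; the text only says the preset order is arbitrary. The names `PRESET_ORDER`, `INDEX` and `to_feature_vector` are hypothetical.

```python
from collections import Counter

# Assumed inventories; the patent gives only the totals 23, 24 and 16.
INITIALS = ["b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h", "j", "q", "x",
            "zh", "ch", "sh", "r", "z", "c", "s", "y", "w"]
FINALS = ["a", "o", "e", "i", "u", "ü", "ai", "ei", "ui", "ao", "ou", "iu",
          "ie", "üe", "er", "an", "en", "in", "un", "ün", "ang", "eng", "ing", "ong"]
WHOLE_SYLLABLES = ["zhi", "chi", "shi", "ri", "zi", "ci", "si", "yi", "wu", "yu",
                   "ye", "yue", "yuan", "yin", "yun", "ying"]

# One dimension per kind of pinyin unit, in an arbitrarily chosen preset order.
PRESET_ORDER = INITIALS + FINALS + WHOLE_SYLLABLES
INDEX = {unit: position for position, unit in enumerate(PRESET_ORDER)}


def to_feature_vector(unit_counts: Counter) -> list:
    """Insert each unit's count at its preset position; all other positions stay 0."""
    vector = [0] * len(PRESET_ORDER)
    for unit, count in unit_counts.items():
        # A unit outside the assumed inventories raises KeyError.
        vector[INDEX[unit]] = count
    return vector


# Unit counts for "gao gao xing xing".
vector = to_feature_vector(Counter({"g": 2, "ao": 2, "x": 2, "ing": 2}))
print(len(vector))                                    # 63
print([vector[INDEX[u]] for u in ("g", "ao", "x", "ing")])  # [2, 2, 2, 2]
```

Because the positions are fixed in advance, feature vectors for any two texts are directly comparable without first collecting the units that occur in them.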
In the embodiments of the application, a predefined preset vector is used, so that generating a feature vector only requires inserting the counted numbers of pinyin units into the preset vector; the generating mode is simple.
Preferably, calculating the distance between the first feature vector and the second feature vector includes: calculating the difference in each corresponding dimension of the first feature vector and the second feature vector; and taking the absolute value of each difference and adding the absolute values together to obtain the distance.
The distance between two feature vectors can be calculated with, for example, the 1-norm: take the absolute value of the difference at each corresponding position of the two vectors (the values of corresponding dimensions) and add these absolute values together. The resulting number is the distance between the two pinyin texts; the smaller the number, the higher the similarity. For example, the similarity of "dangerous hand-pulled noodles" to "taste-thousand hand-pulled noodles" is higher than that of "Chiba hand-pulled noodles" to "taste-thousand hand-pulled noodles".
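The 1-norm distance described above can be sketched in a few lines. The function name `l1_distance` is hypothetical; the two input vectors are the feature vectors of "gao gao xing xing" and "gao gao xin xin" from the earlier example.

```python
def l1_distance(vector1: list, vector2: list) -> int:
    """Sum of the absolute differences of corresponding dimensions (1-norm)."""
    if len(vector1) != len(vector2):
        raise ValueError("feature vectors must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(vector1, vector2))


print(l1_distance([2, 2, 2, 2, 0], [2, 2, 2, 0, 2]))  # 4
```

Only the "ing" and "in" dimensions differ, each by 2, so the distance is 4.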
In the embodiments of the application, determining the similarity of two Chinese texts is converted into judging the distance between two vectors, which improves the accuracy and speed of similar-text recognition.
An embodiment of the application further provides an apparatus for determining Chinese text similarity, which can be used to perform the method for determining Chinese text similarity of the embodiments of the application. As shown in Fig. 2, the apparatus includes: a conversion unit 10, a counting unit 20, a generation unit 30, a calculation unit 40 and a determination unit 50.
Conversion unit 10 is used to for the Chinese character in the first Chinese text to be converted into phonetic, obtains the first phonetic text, by the Chinese character in two Chinese texts is converted into phonetic, obtains the second phonetic text.
Wherein, the first Chinese text and the second Chinese text can be article, sentence, phrase etc..First Chinese text This and the second Chinese text are two texts of similarity to be determined.In the present embodiment, by the first Chinese text and second Chinese text changes into phonetic text respectively.Its corresponding phonetic will be changed into by each word in Chinese text, be formed and spelled Sound text.For example, " in high spirits " to be converted into " xing gao cai lie ".
Statistic unit 20 is used for according to the number of every kind of phonetic unit in rule-statistical the first phonetic text of the Chinese phonetic alphabet With the number of every kind of phonetic unit in the second phonetic text.
The spelling rules of the Chinese phonetic alphabet is that initial consonant is one or more spelling plus simple or compound vowel of a Chinese syllable, the i.e. corresponding phonetic of each Chinese character Sound unit is constituted, wherein it is possible to using initial consonant and simple or compound vowel of a Chinese syllable as phonetic unit.It is overall due to also including in the Chinese phonetic alphabet Recognize pronunciation section, therefore, the entirety recognizes pronunciation section can also be used as phonetic unit.
For example, above-mentioned " xing gao cai lie ", wherein, the phonetic unit for splitting into can be " x ", " ing ", " g ", " ao ", " c ", " ai ", " l ", " ie ", the number of each phonetic unit are 1.Phonetic text " gao gao Xing xing ", " g ", " ao ", " x ", the number of " ing " are 2 after statistics.
The generation unit 30 is configured to generate a first feature vector from the number of each kind of pinyin unit in the first pinyin text, and to generate a second feature vector from the number of each kind of pinyin unit in the second pinyin text.
After the number of each kind of pinyin unit in the two pinyin texts has been counted, a corresponding feature vector is generated from those counts. The feature vector may have multiple dimensions, and the first and second feature vectors have the same number of dimensions.
Optionally, the feature vectors may be generated in either of two ways. In the first, all kinds of pinyin unit in current Hanyu Pinyin are sorted in a preset order, each kind of pinyin unit corresponds to one dimension of the feature vector, and the number of each kind of pinyin unit in a pinyin text is used as the value of that unit's dimension in the feature vector. In the second, the kinds of pinyin unit that appear in the two pinyin texts are collected, and feature vectors are generated with as many dimensions as there are kinds, where the number of each kind of pinyin unit counted in each pinyin text is used as the value of the corresponding dimension in that text's feature vector. For example, for the two pinyin texts "gao gao xing xing" and "gao gao xin xin", the kinds of pinyin unit are "g", "ao", "x", "ing" and "in", so the generated feature vectors have 5 dimensions. With the order ("g", "ao", "x", "ing", "in"), the feature vector of the first pinyin text (the first feature vector) is [2,2,2,2,0], and the feature vector of the second pinyin text (the second feature vector) is [2,2,2,0,2].
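The second generation mode (dimensions taken from the units that actually occur in the two texts) can be sketched as follows. The function name `build_vectors` and the choice of "order of first appearance" as the dimension order are assumptions made for illustration; the counts are those of the example above.

```python
def build_vectors(counts_a, counts_b):
    """Build two equal-length feature vectors from two unit-count mappings."""
    # Dimension order: first appearance across the two texts (an assumption).
    order = list(dict.fromkeys(list(counts_a) + list(counts_b)))
    vec_a = [counts_a.get(unit, 0) for unit in order]
    vec_b = [counts_b.get(unit, 0) for unit in order]
    return order, vec_a, vec_b


first_counts = {"g": 2, "ao": 2, "x": 2, "ing": 2}   # "gao gao xing xing"
second_counts = {"g": 2, "ao": 2, "x": 2, "in": 2}   # "gao gao xin xin"

order, first_vector, second_vector = build_vectors(first_counts, second_counts)
print(order)          # ['g', 'ao', 'x', 'ing', 'in']
print(first_vector)   # [2, 2, 2, 2, 0]
print(second_vector)  # [2, 2, 2, 0, 2]
```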
The computing unit 40 is configured to calculate the distance between the first feature vector and the second feature vector.
The determining unit 50 is configured to determine the similarity between the first Chinese text and the second Chinese text according to the distance, where the smaller the distance, the higher the similarity between the first Chinese text and the second Chinese text.
After the first and second feature vectors have been generated, the distance between them is calculated; the distance may be, for example, the Euclidean distance. The similarity between the two Chinese texts is then determined from the calculated distance: the greater the distance, the lower the similarity, and the smaller the distance, the higher the similarity. For example, the similarity determined between "Chiba hand-pulled noodles" and "hand-pulled noodles of taste thousand" is lower than that between "dangerous hand-pulled noodles" and "hand-pulled noodles of taste thousand", so the text similar to a misspelled text can be identified.
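As a rough sketch of the distance step with the Euclidean distance, the following ranks candidate vectors by their distance to a query vector, nearest (most similar) first. The function names and the two candidate labels are hypothetical, and the vectors are small illustrative counts rather than data from the source.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_distance(query, candidates):
    """Return (name, distance) pairs, smallest distance first."""
    scored = [(name, euclidean(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


query = [2, 2, 2, 2, 0]
candidates = {
    "same_sound": [2, 2, 2, 2, 0],  # identical pinyin units
    "near_sound": [2, 2, 2, 0, 2],  # one final differs
}
ranking = rank_by_distance(query, candidates)
print(ranking[0][0])  # same_sound
```

A smaller distance means a higher similarity, so the first entry of the ranking is the best match.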
According to the embodiments of the present invention, the Chinese characters in the first Chinese text are converted into pinyin to obtain the first pinyin text, and the Chinese characters in the second Chinese text are converted into pinyin to obtain the second pinyin text; the number of each kind of pinyin unit in the first pinyin text and in the second pinyin text is counted according to the rules of Hanyu Pinyin; the first feature vector is generated from the counts for the first pinyin text and the second feature vector from the counts for the second pinyin text; the distance between the first and second feature vectors is calculated; and the similarity between the first and second Chinese texts is determined from the distance, where the smaller the distance, the higher the similarity. This solves the technical problem that the prior art has difficulty effectively identifying similar texts caused by misspelling, and achieves the identification of such texts.
Preferably, the statistic unit is specifically configured to treat each initial in the pinyin of a Chinese character as one pinyin unit and each final as one pinyin unit, and to count the number of each kind of initial and each kind of final in the first pinyin text and the number of each kind of initial and each kind of final in the second pinyin text.
Because Hanyu Pinyin is written in the Latin alphabet and is divided into initials and finals, the pinyin of each Chinese character can be split into an initial and a final (some characters have only a final, such as the character for "love"). In this embodiment, each initial is treated as one pinyin unit and each final as one pinyin unit; the pinyin of each character in a pinyin text is split into an initial and a final, and the number of each kind of initial and each kind of final is counted.
Preferably, the statistic unit is specifically configured to treat each whole-read syllable as one pinyin unit, and, for pinyin that is not a whole-read syllable, to treat its initial as one pinyin unit and its final as one pinyin unit, and to count the number of each kind of initial, each kind of final and each kind of whole-read syllable in the first pinyin text and the number of each kind of initial, each kind of final and each kind of whole-read syllable in the second pinyin text.
Hanyu Pinyin includes syllables that are still pronounced like the initial after a final is added (or still pronounced like the final after an initial is added); these are the whole-read syllables. In this embodiment, each whole-read syllable is treated as one pinyin unit; for pinyin that is not a whole-read syllable, the initial and the final are used as pinyin units, and the number of each kind of pinyin unit is counted. For example, Hanyu Pinyin includes 23 initials, 24 finals and 16 whole-read syllables, so there are 63 kinds of pinyin unit.
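A splitter that honours whole-read syllables might look like the sketch below. The source gives only the count of 16 whole-read syllables; the concrete list used here is the commonly taught inventory and is an assumption, as are the names `WHOLE_SYLLABLES`, `INITIALS` and `split_units`.

```python
# Assumed inventory: the 16 commonly taught whole-read syllables.
WHOLE_SYLLABLES = {"zhi", "chi", "shi", "ri", "zi", "ci", "si", "yi",
                   "wu", "yu", "ye", "yue", "yuan", "yin", "yun", "ying"}

# Two-letter initials first so that "sh" is tried before "s".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_units(syllable):
    """Return the pinyin units of one syllable.

    A whole-read syllable is kept as a single unit; any other syllable
    is split into its initial (if it has one) and its final.
    """
    if syllable in WHOLE_SYLLABLES:
        return [syllable]
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    return [syllable]


print(split_units("shi"))   # ['shi']
print(split_units("xing"))  # ['x', 'ing']
```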
Preferably, the generation unit is specifically configured to insert the number of each kind of pinyin unit in the first pinyin text into the position of the corresponding dimension of a preset vector to obtain the first feature vector, and to insert the number of each kind of pinyin unit in the second pinyin text into the position of the corresponding dimension of the preset vector to obtain the second feature vector, where the preset vector is a vector whose dimensions correspond one-to-one to the kinds of pinyin unit arranged in a preset order.
In the embodiments of the present invention, each dimension of the preset vector represents one kind of pinyin unit, and in a generated feature vector the value of each dimension is the counted number of times the corresponding pinyin unit occurs in the respective pinyin text. All pinyin units are sorted in a preset order, each corresponding to one dimension of the preset vector; the preset order is an arbitrarily selected order.
For example, in the above embodiment in which pinyin units are counted by initial, final and whole-read syllable, the numbers of all initials, finals and whole-read syllables counted in the two pinyin texts are inserted into the 63-dimensional preset vector to generate the feature vectors of the two pinyin texts, where 63 is the total number of initials, finals and whole-read syllables in pinyin. For instance, the pinyin of "happy" is "gao gao xing xing", and the counts of "g", "ao", "x" and "ing" are each 2; in the 63-dimensional pinyin feature vector of the text "happy", the positions of the corresponding initials and finals are 2 and all other positions are 0, so the feature vector is [..., 2, ..., 2, ..., 2, ..., 2, ...] (the omitted entries are 0).
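Filling the 63-dimensional preset vector can be sketched as follows. The source states only the counts (23 initials, 24 finals, 16 whole-read syllables) and that the order is arbitrary; the concrete lists below are the commonly taught inventory and, together with the names `UNIT_ORDER` and `fill_preset_vector`, are assumptions for illustration.

```python
INITIALS = "b p m f d t n l g k h j q x zh ch sh r z c s y w".split()
FINALS = ("a o e i u ü ai ei ui ao ou iu ie üe er "
          "an en in un ün ang eng ing ong").split()
WHOLE = "zhi chi shi ri zi ci si yi wu yu ye yue yuan yin yun ying".split()

# Preset order: initials, then finals, then whole-read syllables.
# Any fixed order works as long as both texts use the same one.
UNIT_ORDER = INITIALS + FINALS + WHOLE


def fill_preset_vector(counts):
    """Place each unit count at the position of its preset dimension."""
    return [counts.get(unit, 0) for unit in UNIT_ORDER]


vector = fill_preset_vector({"g": 2, "ao": 2, "x": 2, "ing": 2})
print(len(vector))                    # 63
print(vector[UNIT_ORDER.index("g")])  # 2
```

Units that fall outside the assumed inventory have no dimension and are ignored by this sketch.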
In the embodiments of the present application, a predefined preset vector is used: when a feature vector is generated, the counted numbers of pinyin units only need to be inserted into the preset vector, so the generation is simple.
Preferably, the computing unit includes: a first computing module, configured to calculate the difference between the first feature vector and the second feature vector in each corresponding dimension; and a second computing module, configured to take the absolute value of the difference in each corresponding dimension and add the absolute values to obtain the distance.
The distance between two feature vectors can be calculated with the 1-norm, among others. The 1-norm is calculated as follows: take the absolute value of the difference at each corresponding position (the values of corresponding dimensions) of the two vectors and add the absolute values; the resulting number represents the distance between the two pinyin texts, and the smaller the number, the higher the similarity. For example, the similarity between "dangerous hand-pulled noodles" and "hand-pulled noodles of taste thousand" is higher than the similarity between "Chiba hand-pulled noodles" and "hand-pulled noodles of taste thousand".
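The 1-norm calculation just described can be sketched in a few lines. The function name `l1_distance` is illustrative; the two vectors are the 5-dimensional counts for "gao gao xing xing" and "gao gao xin xin" in the unit order ("g", "ao", "x", "ing", "in").

```python
def l1_distance(u, v):
    """Sum of absolute differences over corresponding dimensions."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(u, v))


first_vector = [2, 2, 2, 2, 0]
second_vector = [2, 2, 2, 0, 2]
print(l1_distance(first_vector, second_vector))  # 4
```

Only the "ing" and "in" dimensions differ, each by 2, which gives the distance of 4.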
In the embodiments of the present application, determining the similarity of two Chinese texts is converted into judging the distance between two vectors, which improves the accuracy and speed of identifying similar texts.
The device for determining Chinese text similarity includes a processor and a memory. The conversion unit 10, statistic unit 20, generation unit 30, computing unit 40, determining unit 50 and so on are all stored in the memory as program units, and the processor executes the program units stored in the memory. The above may all be stored in the memory.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided, and the similarity of text content is determined by adjusting kernel parameters.
The memory may include volatile memory in computer-readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The present application also provides an embodiment of a computer program product which, when executed on a data processing device, is adapted to execute program code initialized with the following method steps: converting the Chinese characters in a first Chinese text into pinyin to obtain a first pinyin text, and converting the Chinese characters in a second Chinese text into pinyin to obtain a second pinyin text; counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and the number of each kind of pinyin unit in the second pinyin text; generating a first feature vector from the number of each kind of pinyin unit in the first pinyin text, and generating a second feature vector from the number of each kind of pinyin unit in the second pinyin text; calculating the distance between the first feature vector and the second feature vector; and determining the similarity between the first Chinese text and the second Chinese text according to the distance, where the smaller the distance, the higher the similarity between the first Chinese text and the second Chinese text.
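The method steps listed above can be strung together in a short end-to-end sketch. It starts from pinyin texts, so the character-to-pinyin conversion is assumed to have been done already; it uses initials and finals as units and the 1-norm as the distance. All function names are hypothetical.

```python
from collections import Counter

# Two-letter initials first so that "zh" is tried before "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def count_units(pinyin_text):
    """Count initials and finals in a space-separated pinyin text."""
    counts = Counter()
    for syllable in pinyin_text.lower().split():
        initial = next((i for i in INITIALS
                        if syllable.startswith(i) and len(syllable) > len(i)), "")
        if initial:
            counts[initial] += 1
        counts[syllable[len(initial):]] += 1  # the final
    return counts


def pinyin_distance(text_a, text_b):
    """1-norm distance between the unit-count vectors of two pinyin texts."""
    counts_a, counts_b = count_units(text_a), count_units(text_b)
    units = set(counts_a) | set(counts_b)  # shared dimensions
    return sum(abs(counts_a[u] - counts_b[u]) for u in units)


def most_similar(query, candidates):
    """Return the candidate with the smallest distance to the query."""
    return min(candidates, key=lambda c: pinyin_distance(query, c))


print(pinyin_distance("gao gao xing xing", "gao gao xin xin"))  # 4
print(most_similar("gao gao xing xing",
                   ["gao gao xin xin", "xing gao cai lie"]))    # gao gao xin xin
```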
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present application. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present application.

Claims (10)

1. A method for determining Chinese text similarity, characterized by comprising:
converting the Chinese characters in a first Chinese text into pinyin to obtain a first pinyin text, and converting the Chinese characters in a second Chinese text into pinyin to obtain a second pinyin text;
counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and the number of each kind of pinyin unit in the second pinyin text;
generating a first feature vector from the number of each kind of pinyin unit in the first pinyin text, and generating a second feature vector from the number of each kind of pinyin unit in the second pinyin text;
calculating the distance between the first feature vector and the second feature vector;
determining the similarity between the first Chinese text and the second Chinese text according to the distance, wherein the smaller the distance, the higher the similarity between the first Chinese text and the second Chinese text.
2. The method according to claim 1, characterized in that counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and the number of each kind of pinyin unit in the second pinyin text comprises:
treating each initial in the pinyin of a Chinese character as one pinyin unit and each final as one pinyin unit, and counting the number of each kind of initial and each kind of final in the first pinyin text and the number of each kind of initial and each kind of final in the second pinyin text.
3. The method according to claim 1, characterized in that counting, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and the number of each kind of pinyin unit in the second pinyin text comprises:
treating each whole-read syllable as one pinyin unit, treating the initial of pinyin that is not a whole-read syllable as one pinyin unit and the final of pinyin that is not a whole-read syllable as one pinyin unit, and counting the number of each kind of initial, each kind of final and each kind of whole-read syllable in the first pinyin text and the number of each kind of initial, each kind of final and each kind of whole-read syllable in the second pinyin text.
4. The method according to any one of claims 1 to 3, characterized in that generating the first feature vector from the number of each kind of pinyin unit in the first pinyin text and generating the second feature vector from the number of each kind of pinyin unit in the second pinyin text comprises:
inserting the number of each kind of pinyin unit in the first pinyin text into the position of the corresponding dimension of a preset vector to obtain the first feature vector, and inserting the number of each kind of pinyin unit in the second pinyin text into the position of the corresponding dimension of the preset vector to obtain the second feature vector, wherein the preset vector is a vector whose dimensions correspond one-to-one to the kinds of pinyin unit arranged in a preset order.
5. The method according to claim 1, characterized in that calculating the distance between the first feature vector and the second feature vector comprises:
calculating the difference between the first feature vector and the second feature vector in each corresponding dimension;
taking the absolute value of the difference in each corresponding dimension, and adding the absolute values to obtain the distance.
6. A device for determining Chinese text similarity, characterized by comprising:
a conversion unit, configured to convert the Chinese characters in a first Chinese text into pinyin to obtain a first pinyin text, and to convert the Chinese characters in a second Chinese text into pinyin to obtain a second pinyin text;
a statistic unit, configured to count, according to the rules of Hanyu Pinyin, the number of each kind of pinyin unit in the first pinyin text and the number of each kind of pinyin unit in the second pinyin text;
a generation unit, configured to generate a first feature vector from the number of each kind of pinyin unit in the first pinyin text, and to generate a second feature vector from the number of each kind of pinyin unit in the second pinyin text;
a computing unit, configured to calculate the distance between the first feature vector and the second feature vector;
a determining unit, configured to determine the similarity between the first Chinese text and the second Chinese text according to the distance, wherein the smaller the distance, the higher the similarity between the first Chinese text and the second Chinese text.
7. The device according to claim 6, characterized in that the statistic unit is specifically configured to treat each initial in the pinyin of a Chinese character as one pinyin unit and each final as one pinyin unit, and to count the number of each kind of initial and each kind of final in the first pinyin text and the number of each kind of initial and each kind of final in the second pinyin text.
8. The device according to claim 6, characterized in that the statistic unit is specifically configured to treat each whole-read syllable as one pinyin unit, to treat the initial of pinyin that is not a whole-read syllable as one pinyin unit and the final of pinyin that is not a whole-read syllable as one pinyin unit, and to count the number of each kind of initial, each kind of final and each kind of whole-read syllable in the first pinyin text and the number of each kind of initial, each kind of final and each kind of whole-read syllable in the second pinyin text.
9. The device according to any one of claims 6 to 8, characterized in that the generation unit is specifically configured to insert the number of each kind of pinyin unit in the first pinyin text into the position of the corresponding dimension of a preset vector to obtain the first feature vector, and to insert the number of each kind of pinyin unit in the second pinyin text into the position of the corresponding dimension of the preset vector to obtain the second feature vector, wherein the preset vector is a vector whose dimensions correspond one-to-one to the kinds of pinyin unit arranged in a preset order.
10. The device according to claim 6, characterized in that the computing unit comprises:
a first computing module, configured to calculate the difference between the first feature vector and the second feature vector in each corresponding dimension;
a second computing module, configured to take the absolute value of the difference in each corresponding dimension and add the absolute values to obtain the distance.
CN201510850305.6A 2015-11-27 2015-11-27 Method and device for determining similarity of Chinese texts Active CN106815593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510850305.6A CN106815593B (en) 2015-11-27 2015-11-27 Method and device for determining similarity of Chinese texts

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510850305.6A CN106815593B (en) 2015-11-27 2015-11-27 Method and device for determining similarity of Chinese texts

Publications (2)

Publication Number Publication Date
CN106815593A true CN106815593A (en) 2017-06-09
CN106815593B CN106815593B (en) 2019-12-10

Family

ID=59155413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510850305.6A Active CN106815593B (en) 2015-11-27 2015-11-27 Method and device for determining similarity of Chinese texts

Country Status (1)

Country Link
CN (1) CN106815593B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102122298A (en) * 2011-03-07 2011-07-13 清华大学 Method for matching Chinese similarity
CN102184195A (en) * 2011-04-20 2011-09-14 北京百度网讯科技有限公司 Method, device and device for acquiring similarity between character strings
CN102214238A (en) * 2011-07-01 2011-10-12 临沂大学 Device and method for matching similarity of Chinese words
CN102332012A (en) * 2011-09-13 2012-01-25 南方报业传媒集团 Chinese text sorting method based on correlation study between sorts
CN103207905A (en) * 2013-03-28 2013-07-17 大连理工大学 Method for calculating text similarity based on target text
CN103605694A (en) * 2013-11-04 2014-02-26 北京奇虎科技有限公司 Device and method for detecting similar texts
WO2014087703A1 (en) * 2012-12-06 2014-06-12 楽天株式会社 Word division device, word division method, and word division program

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729300A (en) * 2017-09-18 2018-02-23 百度在线网络技术(北京)有限公司 Processing method, device, equipment and the computer-readable storage medium of text similarity
CN108319978A (en) * 2018-02-01 2018-07-24 北京捷通华声科技股份有限公司 A kind of semantic similarity calculation method and device
CN109741749A (en) * 2018-04-19 2019-05-10 北京字节跳动网络技术有限公司 A kind of method and terminal device of speech recognition
CN109741749B (en) * 2018-04-19 2020-03-27 北京字节跳动网络技术有限公司 Voice recognition method and terminal equipment
CN109299726A (en) * 2018-08-01 2019-02-01 昆明理工大学 A kind of Chinese character pattern Similarity algorithm based on feature vector and stroke order coding

Also Published As

Publication number Publication date
CN106815593B (en) 2019-12-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

GR01 Patent grant