CN116186321A - Big data policy query matching method and system - Google Patents

Big data policy query matching method and system

Publication number
CN116186321A
Authority
CN
China
Prior art keywords
data
policy
matching
voice file
communication voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310092534.0A
Other languages
Chinese (zh)
Inventor
李逸航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Constantine Information Technology Chongqing Co ltd
Original Assignee
Constantine Information Technology Chongqing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-02-09
Filing date
2023-02-09
Publication date
2023-05-30

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/535 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/61 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/638 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/65 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to the technical field of data processing, and in particular to a big data policy query matching method and system. The acquired business license picture and communication voice file are scanned and preprocessed respectively; keywords are extracted from the preprocessed communication voice file, then screened and classified based on business scope categories; and the data in the word stock constructed after screening is matched against the data in a policy database to generate a corresponding matching tree, which is displayed. Information from the conversation with the enterprise can thus be queried and matched against all policies, and all satisfied policies are displayed by matching degree for the enterprise to choose from. This is intended to ensure that all policies meeting the requirements are matched and to improve the policy support the enterprise enjoys.

Description

Big data policy query matching method and system
Technical Field
The invention relates to the technical field of data processing, in particular to a big data policy query matching method and a system thereof.
Background
At present, more and more enterprises are being founded, government support for small start-ups is increasing, and the number of policies aimed at them is growing. When applying for government subsidy policies, an enterprise must learn about each policy in advance and read it in full to judge whether it meets the application conditions, and then prepare the filing materials accordingly. The current way of querying enterprise policies is cumbersome: it usually requires a special trip to the relevant departments for a manual inquiry, which consumes a great deal of time, and there is no way to ensure that all policies meeting the requirements are matched, so the policy support the enterprise actually enjoys is greatly reduced.
Disclosure of Invention
The invention aims to provide a big data policy query matching method and system that ensure all policies meeting the requirements can be matched, and that improve the policy support enjoyed by enterprises.
To achieve the above object, in a first aspect, the present invention provides a big data policy query matching method, including the steps of:
scanning and preprocessing the acquired business license picture and the communication voice file respectively;
extracting keywords from the preprocessed communication voice file, and screening and classifying the communication voice file based on business range categories;
and matching the data in the word stock constructed after screening with the data in the policy database, generating a corresponding matching tree, and displaying.
Wherein, scanning and preprocessing the acquired business license picture and the communication voice file respectively includes:
scanning the business license paper version of the user, and naming and storing the obtained business license picture;
and acquiring real-time data of the communication voice file, and transmitting the data to a server for caching and preprocessing.
Wherein, acquiring real-time data of the communication voice file and transmitting the data to a server for caching and preprocessing includes:
collecting voice data communicated on site in real time, and transmitting the voice data to a server for caching in real time based on a data transmission protocol;
judging tone color information in the cached communication voice file, and cutting and classifying the communication voice file according to tone color change points.
After scanning and preprocessing the acquired business license picture and the communication voice file respectively, the method further comprises the following steps:
and extracting the text of the business scope from the scanned business license picture, and dividing the extracted business scope at its separators to obtain a plurality of business keywords.
Wherein the method further comprises:
and performing noise removal and filtering on the voice data acquired in real time.
In a second aspect, the present invention provides a big data policy query matching system, adapted for use in a big data policy query matching method as described in the first aspect,
the big data policy query matching system comprises a data acquisition module, a data extraction module and a policy matching module, wherein the data acquisition module, the data extraction module and the policy matching module are connected in sequence;
the data acquisition module is used for respectively scanning and preprocessing the acquired business license picture and the communication voice file;
the data extraction module is used for extracting keywords from the preprocessed communication voice file, and screening and classifying and storing the keywords based on business range categories;
the policy matching module is used for matching the data in the word stock constructed after screening with the data in the policy database, generating a corresponding matching tree and displaying.
According to the big data policy query matching method and system of the invention, the acquired business license picture and communication voice file are scanned and preprocessed respectively; keywords are extracted from the preprocessed communication voice file, then screened and classified based on business scope categories; and the data in the word stock constructed after screening is matched against the data in the policy database to generate a corresponding matching tree, which is displayed. Information from the conversation with the enterprise can thus be queried and matched against all policies, and all satisfied policies are displayed by matching degree for the enterprise to choose from. This is intended to ensure that all policies meeting the requirements are matched and to improve the policy support the enterprise enjoys.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a big data policy query matching method according to a first embodiment of the present invention.
Fig. 2 is a flowchart of a big data policy query matching method according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of the big data policy query matching system provided by the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
Referring to fig. 1, a first embodiment of the present invention provides a big data policy query matching method, which includes the following steps:
S101, scanning and preprocessing the acquired business license picture and the communication voice file respectively.
Specifically, a business license paper version brought by an enterprise is scanned to obtain a corresponding business license picture, or an electronic version of the business license sent by the enterprise is scanned to obtain a corresponding business license picture, and the business license picture is named and stored according to the name of the enterprise.
The text of the business scope is then extracted from the scanned business license picture, and the extracted business scope is divided at its separators to obtain a plurality of business keywords. Extracting text from pictures is existing technology, and the business scope appears at a fixed position on the business license and uses fixed separator symbols, so extracting the business keywords is straightforward and is not described further here.
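The separator-based division described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_business_scope` and the set of separator characters are assumptions, and the input is taken to be business-scope text that has already been extracted from the license picture.

```python
import re

# Separator characters assumed to appear in a business-scope field (an assumption,
# covering both ASCII and full-width punctuation).
SEPARATORS = re.compile(r"[;；,，、。]")


def split_business_scope(scope_text):
    """Split extracted business-scope text into an ordered list of business keywords."""
    keywords = []
    for part in SEPARATORS.split(scope_text):
        word = part.strip()
        # Skip empty fragments and repeats while preserving first-seen order.
        if word and word not in keywords:
            keywords.append(word)
    return keywords


if __name__ == "__main__":
    scope = "software development; information technology consulting; computer hardware sales;"
    print(split_business_scope(scope))
```

Keeping the keywords in first-seen order makes it easy to trace each keyword back to its position in the business-scope field.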
Before the communication voice file is acquired, the enterprise's consent to recording is obtained in advance. Once consent is given, the on-site conversation is collected in real time and transmitted to a server for caching in real time based on a data transmission protocol. After receiving the data, the server performs noise removal and filtering on the voice data, so that keywords can be extracted more accurately in the later data extraction stage.
After filtering, the timbre information in the acquired communication voice file is judged. Because different people's voices have different timbres, the timbre alone is enough to decide whether the current speaker's data needs to be processed, which reduces the amount of data to be processed and increases processing speed.
In judging the timbre information, the silent segments and voice segments in the communication voice data are obtained first, and it is judged whether the timbre of two adjacent voice segments is the same; if so, the second voice segment is extracted and cached. It is then judged whether the timbre of the first and third voice segments is the same; if not, the third voice segment is extracted and cached separately, and the third and second voice segments are stored in the same area. This continues until all voice segments have been compared. All voice segments in the cache area holding the first voice segment are then sorted by the time information in the communication voice file to obtain one voice segment database. The voice segments in the other cache area are likewise sorted chronologically. The voice of different speakers is thereby separated, so that the data of each speaker can be processed selectively and the amount of data to be processed is reduced.
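One way to picture the grouping of voice segments by timbre is the sketch below. It is a simplified reading of the paragraph above, under stated assumptions: each voice segment is assumed to carry a numeric timbre feature vector (how that vector is computed is not specified in the source), two timbres are treated as "the same" when their cosine similarity reaches a hypothetical threshold, and each new segment is compared with the first segment of every existing group. The names `Segment`, `cosine` and `group_by_timbre` are illustrative only.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # start time of the voice segment, in seconds
    timbre: tuple  # hypothetical timbre feature vector for the segment


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_by_timbre(segments, threshold=0.9):
    """Place each segment in the cache group whose reference timbre it matches."""
    groups = []  # each group is a list of segments; its first entry is the reference
    for seg in segments:
        for group in groups:
            if cosine(seg.timbre, group[0].timbre) >= threshold:
                group.append(seg)
                break
        else:
            # No existing group matched, so this segment opens a new cache area.
            groups.append([seg])
    for group in groups:
        group.sort(key=lambda s: s.start)  # order each group chronologically
    return groups


if __name__ == "__main__":
    segs = [
        Segment(0.0, (1.0, 0.0)),
        Segment(2.0, (0.0, 1.0)),
        Segment(4.0, (0.99, 0.05)),
        Segment(6.0, (0.1, 1.0)),
    ]
    for g in group_by_timbre(segs):
        print([s.start for s in g])
```

Each resulting group corresponds to one speaker's chronologically ordered voice segments, so later steps can process only the group of interest.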
S102, extracting keywords from the preprocessed communication voice file, and screening and classifying and storing the keywords based on business range categories.
Specifically, any group of voice information in the cached voice segment database is selected as needed and converted into text. An existing extraction model is then used to extract voice keywords from the text according to set extraction rules, yielding several groups of voice keywords. Each voice keyword is marked according to the time order of the voice segment it comes from. At the same time, the speech in the current voice segment is recognized and the continuity of its word sequence is judged, that is, the current voice segment is divided into sentences. Whether two adjacent words belong to the same sentence is judged from the time interval between them and from the phrase composition relations recorded in an existing big data database. If the time interval between two adjacent words is greater than the set sentence interval, it is judged whether the two words form an existing phrase: if they do not, a sentence break is placed between them; if they do, no sentence break is made. Once sentence breaking is complete, each voice keyword is packaged together with the sentence that contains it.
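The sentence-breaking rule in the paragraph above (break only when the pause exceeds the set interval and the two words do not form a known phrase) can be sketched as follows. The word timings, the `max_gap` value and the `known_phrases` set are hypothetical inputs; in the source they would come from speech recognition and an existing phrase database.

```python
def split_sentences(words, max_gap, known_phrases):
    """Divide timed words into sentences.

    words: list of (text, start_time, end_time) tuples in spoken order.
    max_gap: the set sentence interval, in seconds.
    known_phrases: set of (previous_word, next_word) pairs that form an existing phrase.
    """
    sentences, current = [], []
    for i, (text, start, _end) in enumerate(words):
        if current:
            prev_text, _prev_start, prev_end = words[i - 1]
            gap = start - prev_end
            # Break only if the pause is long AND the pair is not a known phrase.
            if gap > max_gap and (prev_text, text) not in known_phrases:
                sentences.append(current)
                current = []
        current.append(text)
    if current:
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    timed_words = [
        ("high", 0.0, 0.3),
        ("tech", 0.4, 0.7),
        ("subsidy", 1.6, 2.0),  # long pause, but ("tech", "subsidy") is a known phrase
        ("policy", 2.1, 2.5),
        ("tax", 3.5, 3.8),      # long pause and no known phrase: sentence break
    ]
    print(split_sentences(timed_words, 0.5, {("tech", "subsidy")}))
```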
Each voice keyword is then matched against each business keyword to judge whether the voice keyword falls within the scope covered by the business keyword. If so, a folder named after the business keyword is created and the voice keyword is stored in it. This continues until all voice keywords have been classified, after which all folders are stored to form the word stock.
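The folder-per-business-keyword classification can be modelled with a dictionary, as in the sketch below. The source does not say how "covered by" is decided, so the default test used here (case-insensitive substring containment in either direction) is purely an assumption; a caller can pass a different `covers` predicate. The names `build_word_stock` and `default_covers` are illustrative.

```python
def default_covers(business_keyword, voice_keyword):
    """Assumed coverage test: one keyword contains the other, ignoring case."""
    b, v = business_keyword.lower(), voice_keyword.lower()
    return v in b or b in v


def build_word_stock(business_keywords, voice_keywords, covers=default_covers):
    """Return {business_keyword: [voice_keywords]}, one 'folder' per business keyword."""
    word_stock = {}
    for voice in voice_keywords:
        for business in business_keywords:
            if covers(business, voice):
                # The folder is created on first use and named after the business keyword.
                word_stock.setdefault(business, []).append(voice)
    return word_stock


if __name__ == "__main__":
    business = ["software development", "hardware sales"]
    voice = ["software", "sales", "catering"]
    print(build_word_stock(business, voice))
```

Voice keywords that no business keyword covers (here, "catering") are simply left out of the word stock.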
In order to protect enterprise data and enterprise secrets from leakage, the constructed folders are encrypted when they are stored, as follows: the folder names are character-converted, and the word stock is cached in a blockchain for encrypted storage.
And S103, matching the data in the word stock constructed after screening with the data in the policy database, generating a corresponding matching tree, and displaying.
Specifically, the data in the word stock is extracted and matched against the data in the policy database. In the matching process, the business keyword on each folder is first matched against the policy database to obtain first policy data that satisfies the business scope. The voice keywords in the folder are then extracted and matched against the first policy data to screen out second policy data that meets the requirements. Because an item of second policy data may satisfy more than one group of voice keywords, the sentences associated with the voice keywords are also matched against the second policy data, and the matching degree between the phrases in those sentences and the second policy data is calculated. A root node is then constructed from the business keyword, and all voice keywords contained under that business keyword are used as first-level child nodes. Under each voice keyword, each matching-degree value is used as a child node, and items of second policy data with the same matching-degree value are placed under the same child node and identified by their policy titles. In this way, all policies matched to the business keyword are displayed intuitively together with their matching degrees.
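The layering of the matching tree (business keyword, then voice keywords, then matching-degree values, then policy titles) can be sketched with nested dictionaries. The source does not give a formula for the matching degree, so the one used here (the fraction of sentence phrases found in the policy text, by naive substring search) is an assumption, as are the function names and the input shapes.

```python
def matching_degree(phrases, policy_text):
    """Assumed measure: share of the sentence's phrases that occur in the policy text."""
    if not phrases:
        return 0.0
    hits = sum(1 for phrase in phrases if phrase in policy_text)
    return round(hits / len(phrases), 2)


def build_matching_tree(business_keyword, voice_sentences, policies):
    """Build {business_keyword: {voice_keyword: {degree: [policy titles]}}}.

    voice_sentences: {voice_keyword: [phrases of the sentence containing it]}.
    policies: {policy_title: policy_text}, already filtered by business scope.
    """
    tree = {business_keyword: {}}
    for voice_keyword, phrases in voice_sentences.items():
        by_degree = {}
        for title, text in policies.items():
            if voice_keyword not in text:
                continue  # the policy does not satisfy this voice keyword
            degree = matching_degree(phrases, text)
            # Policies with the same matching degree share one child node.
            by_degree.setdefault(degree, []).append(title)
        # Highest matching degree first, so the best matches are displayed on top.
        tree[business_keyword][voice_keyword] = dict(sorted(by_degree.items(), reverse=True))
    return tree


if __name__ == "__main__":
    policies = {
        "Policy A": "software subsidy for small enterprises, including office rent",
        "Policy B": "software subsidy",
        "Policy C": "catering support",
    }
    sentences = {"software": ["subsidy", "rent"]}
    print(build_matching_tree("software development", sentences, policies))
```

Substring search is used only to keep the sketch short; a real matcher would need proper tokenization so that, for example, "rent" does not match inside "current".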
Meanwhile, to improve the pass rate of enterprise policy applications, all matching trees are cross-checked: child nodes with the same policy title are highlighted in the same colour, and business keywords with the same item are likewise highlighted in the same colour. Enterprises can thus see which policy is more likely to be approved and which preferential policies they can enjoy, and all policies meeting the current enterprise's requirements are shown to the user, making selection more convenient.
Referring to fig. 2, a second embodiment of the present invention provides a big data policy query matching method, which includes the following steps:
S201, dynamically capturing all obtained policy data and constructing a policy database.
Specifically, a crawler tool is used to crawl all published and available policy data from the Internet in real time, and all acquired policy data is collected and cached. Features are extracted from the title of each policy to obtain its application scope, and all policy data is classified by application scope. Several policy keywords, such as supported items and supported content, are then extracted in turn from each policy, and a policy keyword group is constructed in the order in which the keywords appear in the policy. Each policy is packaged under its policy keyword group, which replaces the original title, and is stored under its application scope by year of release. All policy data is thereby classified and stored by the extracted application scope, forming the policy database.
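The storage layout described above (application scope, then release year, then policies labelled by their keyword group) can be sketched as a nested dictionary. Crawling and feature extraction are outside this sketch: the input records, their field names and the function name `build_policy_database` are assumptions. The source replaces the original title with the keyword group; the sketch keeps the title alongside the label only so the example stays traceable.

```python
from collections import defaultdict


def build_policy_database(policies):
    """Organise policy records as {application_scope: {year: [entries]}}.

    policies: iterable of dicts with the hypothetical fields
    'title', 'scope', 'year' and 'keywords' (keywords in order of appearance).
    """
    database = defaultdict(lambda: defaultdict(list))
    for policy in policies:
        # The policy keyword group, joined in order of appearance, labels the entry.
        label = "/".join(policy["keywords"])
        database[policy["scope"]][policy["year"]].append(
            {"label": label, "title": policy["title"]}
        )
    # Convert to plain dicts so the result prints and compares cleanly.
    return {scope: dict(years) for scope, years in database.items()}


if __name__ == "__main__":
    records = [
        {"title": "T1", "scope": "software", "year": 2022,
         "keywords": ["supported items", "supported content"]},
        {"title": "T2", "scope": "software", "year": 2023, "keywords": ["rent subsidy"]},
        {"title": "T3", "scope": "catering", "year": 2023, "keywords": ["tax relief"]},
    ]
    print(build_policy_database(records))
```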
Because the constructed policy database collects the policy data published on the Internet in real time, omissions are avoided as far as possible, so that when policies are matched for an enterprise, all policies meeting the requirements can be matched as far as possible, improving the policy support the enterprise enjoys.
The descriptions in the specific implementation manners of S202 to S204 are identical to those in the specific implementation manners of S101 to S103 in the first embodiment of the present invention, and are not repeated here.
Referring to fig. 3, the present invention provides a big data policy query matching system, which includes a data acquisition module 1, a data extraction module 2 and a policy matching module 3, wherein the data acquisition module 1, the data extraction module 2 and the policy matching module 3 are sequentially connected;
the data acquisition module 1 is used for respectively scanning and preprocessing acquired business license pictures and communication voice files;
the data extraction module 2 is used for extracting keywords from the preprocessed communication voice file, and screening and classifying and storing the keywords based on business range categories;
the policy matching module 3 is configured to match the data in the word stock constructed after screening with the data in the policy database, generate a corresponding matching tree, and display the matching tree.
In this embodiment, the data acquisition module 1 is configured to scan and preprocess the acquired business license picture and the communication voice file respectively, and to dynamically capture all the acquired policy data to construct a policy database; the data extraction module 2 is used for extracting keywords from the preprocessed communication voice file, and screening, classifying and storing the keywords based on business scope categories; the policy matching module 3 is configured to match the data in the word stock constructed after screening with the data in the policy database, generate a corresponding matching tree, and display the matching tree. The detailed flow is consistent with the big data policy query matching method described in the present application and is therefore not repeated here.
The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of the claims. Those skilled in the art will understand that all or part of the procedures implementing the above embodiments, and equivalent changes made according to the claims of the present invention, still fall within the scope of the present invention.

Claims (6)

1. A big data policy query matching method, comprising the steps of:
scanning and preprocessing the acquired business license picture and the communication voice file respectively;
extracting keywords from the preprocessed communication voice file, and screening and classifying the communication voice file based on business range categories;
and matching the data in the word stock constructed after screening with the data in the policy database, generating a corresponding matching tree, and displaying.
2. The big data policy query matching method of claim 1, wherein the scanning and preprocessing of the acquired business license picture and the communication voice file respectively comprises:
scanning the business license paper version of the user, and naming and storing the obtained business license picture;
and acquiring real-time data of the communication voice file, and transmitting the data to a server for caching and preprocessing.
3. The big data policy query matching method of claim 2, wherein the real-time data acquisition of the communication voice file and the transmission to the server for buffering and preprocessing comprises:
collecting voice data communicated on site in real time, and transmitting the voice data to a server for caching in real time based on a data transmission protocol;
judging tone color information in the cached communication voice file, and cutting and classifying the communication voice file according to tone color change points.
4. The big data policy query matching method of claim 1, wherein after scanning and preprocessing the acquired business license picture and the communication voice file, respectively, the method further comprises:
and extracting the characters in the operation range in the scanned business license picture, and dividing the extracted operation range based on the separator to obtain a plurality of operation keywords.
5. The big data policy query matching method of claim 1, wherein said method further comprises:
and performing impurity removal and filtering treatment on the voice data acquired in real time.
6. A big data policy query matching system, adapted for use in the big data policy query matching method of any one of claims 1 to 5, wherein,
the big data policy query matching system comprises a data acquisition module, a data extraction module and a policy matching module, wherein the data acquisition module, the data extraction module and the policy matching module are connected in sequence;
the data acquisition module is used for respectively scanning and preprocessing the acquired business license picture and the communication voice file;
the data extraction module is used for extracting keywords from the preprocessed communication voice file, and screening and classifying and storing the keywords based on business range categories;
the policy matching module is used for matching the data in the word stock constructed after screening with the data in the policy database, generating a corresponding matching tree and displaying.
CN202310092534.0A 2023-02-09 2023-02-09 Big data policy query matching method and system Pending CN116186321A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310092534.0A CN116186321A (en) 2023-02-09 2023-02-09 Big data policy query matching method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310092534.0A CN116186321A (en) 2023-02-09 2023-02-09 Big data policy query matching method and system

Publications (1)

Publication Number Publication Date
CN116186321A true CN116186321A (en) 2023-05-30

Family

ID=86432205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310092534.0A Pending CN116186321A (en) 2023-02-09 2023-02-09 Big data policy query matching method and system

Country Status (1)

Country Link
CN (1) CN116186321A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination