CN108271041B - Method and device for processing messy codes - Google Patents

Publication number
CN108271041B
Authority
CN
China
Prior art keywords
corresponding relation
text data
messy
codes
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611264769.XA
Other languages
Chinese (zh)
Other versions
CN108271041A (en)
Inventor
焦张波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201611264769.XA priority Critical patent/CN108271041B/en
Publication of CN108271041A publication Critical patent/CN108271041A/en
Application granted granted Critical
Publication of CN108271041B publication Critical patent/CN108271041B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/61 Network physical structure; Signal processing
    • H04N21/6156 Network physical structure; Signal processing specially adapted to the upstream path of the transmission network
    • H04N21/6175 Network physical structure; Signal processing specially adapted to the upstream path of the transmission network involving transmission via Internet

Abstract

The embodiment of the invention discloses a messy code processing method and a messy code processing device, which are used for conveniently eliminating messy codes from text data. The method provided by the embodiment of the invention comprises the following steps: acquiring text data; judging whether the text data comprises messy codes in a pre-established corresponding relation, wherein the corresponding relation maps messy codes to operations; and if the text data comprises messy codes in the corresponding relation, processing the text data by using the operation corresponding to the messy codes according to the corresponding relation so as to eliminate the messy codes in the text data. In this way, the text data is compared with the messy codes recorded in the corresponding relation, and if the text data comprises any of them, the messy codes can be eliminated from the text data by using the operation recorded in the corresponding relation.

Description

Method and device for processing messy codes
Technical Field
The present invention relates to the field of data processing, and in particular to a method and an apparatus for processing messy codes.
Background
Text data acquired through equipment often contains messy codes, caused either by problems in the acquisition process or by the equipment itself.
For example, in IPTV data processing, the data source may be collected by a device, so the obtained data may contain messy codes, as shown in Table 1 below:
Table 1:
Channel | Number of viewers | Viewing duration
Huaxia Satellite TV~ | 1000 | 2300
Center##Channel A | 2000 | 4000
Huaxia%@ Satellite TV | 3000 | 5300
xx4486e | 100 | 5000
When users encounter messy codes in text data, developers or operation and maintenance personnel usually have to step in to optimize the equipment or algorithms used for data acquisition, a solution that is time-consuming and cumbersome.
Disclosure of Invention
The embodiment of the invention provides a messy code processing method and a messy code processing device, which are used for conveniently eliminating messy codes of text data.
In order to solve the above technical problem, an embodiment of the present invention provides the following technical solutions:
a messy code processing method, comprising:
acquiring text data;
judging whether the text data comprises messy codes in a pre-established corresponding relation, wherein the corresponding relation comprises the messy codes and the corresponding relation of operation;
and if the text data comprises messy codes in the corresponding relation, processing the text data by using the operation corresponding to the messy codes in the text data according to the corresponding relation so as to eliminate the messy codes in the text data.
In order to solve the above technical problem, an embodiment of the present invention further provides the following technical solutions:
a messy code processing apparatus, comprising:
an acquisition unit configured to acquire text data;
the judging unit is used for judging whether the text data comprises messy codes in a pre-established corresponding relation, and the corresponding relation comprises the messy codes and the corresponding relation of operation;
and the processing unit is used for processing the text data by using the operation corresponding to the messy codes according to the corresponding relation if the text data comprises the messy codes in the corresponding relation so as to eliminate the messy codes in the text data.
According to the technical scheme, the embodiment of the invention has the following advantages:
after the text data is acquired, whether the text data comprises messy codes in a pre-established corresponding relation is judged, and the corresponding relation comprises the messy codes and the corresponding relation of operation. If the text data comprises messy codes in the corresponding relation, the text data is processed by using the operation corresponding to the messy codes according to the corresponding relation, so that the messy codes in the text data can be eliminated. Therefore, the text data comprising the messy codes is compared with the messy codes of the corresponding relation, and if the text data comprises the messy codes of the corresponding relation, the messy codes can be eliminated from the text data by using the operation of the corresponding relation.
Drawings
Fig. 1 is a flowchart of a messy code processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a messy code processing method according to another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a messy code processing apparatus according to another embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a messy code processing method and a messy code processing device, which are used for conveniently eliminating messy codes of text data.
Fig. 1 is a flowchart of a messy code processing method according to an embodiment of the present invention. Referring to Fig. 1, the method of the embodiment of the present invention includes:
Step 101: acquiring text data;
Step 102: judging whether the text data comprises messy codes in a pre-established corresponding relation, wherein the corresponding relation maps messy codes to operations; if the text data comprises messy codes in the corresponding relation, step 103 is executed.
Step 103: processing the text data by using the operation corresponding to the messy codes according to the corresponding relation so as to eliminate the messy codes in the text data.
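Steps 101 to 103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the patent: the corresponding relation is modelled as a hypothetical Python dict that maps each messy code to a function, and the names `Correspondence` and `process_text` are invented for this example.

```python
from typing import Callable, Dict

# Hypothetical model of the corresponding relation:
# messy code -> operation to run on text that contains it.
Correspondence = Dict[str, Callable[[str], str]]


def process_text(text: str, correspondence: Correspondence) -> str:
    """Apply the operation of every messy code found in the text."""
    for messy, operation in correspondence.items():
        if messy in text:            # step 102: does the text contain this messy code?
            text = operation(text)   # step 103: run the operation mapped to it
    return text


# Step 101: the acquired text data (sample value taken from Table 1).
acquired = "Huaxia Satellite TV~"

# A single illustrative entry: the messy code "~" maps to a delete operation.
correspondence: Correspondence = {"~": lambda t: t.replace("~", "")}

cleaned = process_text(acquired, correspondence)
```

Text that contains none of the recorded messy codes passes through unchanged, which matches the "no processing" branch of step 102.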
Optionally,
there are at least two types of corresponding relations, and different types of corresponding relations correspond to different types of operations and different priority levels;
judging whether the text data comprises messy codes in the pre-established corresponding relation comprises:
judging, by using the different types of corresponding relations one after another in order of priority level, whether the text data comprises messy codes in those corresponding relations.
Optionally,
the corresponding relations comprise a first corresponding relation, a second corresponding relation, a third corresponding relation and a fourth corresponding relation,
the first corresponding relation comprises a first messy code and a corresponding relation of a first operation, the first operation is to replace the text data to be processed comprising the first messy code with a first text, and the first messy code comprises all characters in the text data to be processed;
the second corresponding relation comprises a corresponding relation between second messy codes and a second operation, the second operation is to replace the text data to be processed comprising the second messy codes with a second text, and the second messy codes are partial characters in the text data to be processed;
the third corresponding relation comprises a corresponding relation between a third messy code and a third operation, and the third operation is to delete the third messy code from the text data to be processed;
the fourth corresponding relation comprises a fourth messy code and a fourth operation, and the fourth operation is used for hiding the text data to be processed comprising the fourth messy code.
Optionally,
the priority levels of the corresponding relations are as follows from high to low in sequence: a first correspondence, a second correspondence, a third correspondence, and a fourth correspondence.
Optionally,
there are a plurality of corresponding relations, each corresponding relation further comprises a first preset index, and the text data further comprises a second preset index,
before judging whether the text data comprises messy codes in the pre-established corresponding relation, the method of the embodiment of the invention further comprises the following steps:
determining a target corresponding relation corresponding to a first preset index and a second preset index from the corresponding relations;
judging whether the text data comprises messy codes in the pre-established corresponding relation, comprising the following steps:
and judging whether the text data comprises messy codes in the target corresponding relation.
Optionally,
the first preset index is the first establishment time of the corresponding relation, and the second preset index is the second establishment time of the text data.
Optionally,
there are a plurality of corresponding relations, and each corresponding relation further comprises a user name,
before judging whether the text data comprises messy codes in the pre-established corresponding relation, the method of the embodiment of the invention further comprises the following steps:
acquiring the user name of the current operation user;
determining, from the plurality of corresponding relations, the corresponding relation whose user name is the same as the user name of the current operation user;
judging whether the text data comprises messy codes in the pre-established corresponding relation, comprising the following steps:
and judging whether the text data comprises messy codes in the determined corresponding relation.
In summary, after the text data is acquired, it is judged whether the text data comprises messy codes in a pre-established corresponding relation, where the corresponding relation maps messy codes to operations. If the text data comprises messy codes in the corresponding relation, the text data is processed by using the operation corresponding to the messy codes according to the corresponding relation, so that the messy codes in the text data can be eliminated. In this way, the text data is compared with the messy codes recorded in the corresponding relation, and if the text data comprises any of them, the messy codes can be eliminated from the text data by using the operation recorded in the corresponding relation.
Fig. 2 is a flowchart of a messy code processing method according to an embodiment of the present invention. The embodiment shown in Fig. 2 is described below with reference to the above content and to Fig. 2.
Before the flow of the method of the embodiment shown in Fig. 2 is described, the corresponding relation used in the method is described first as background.
In the method of the embodiment of the invention, a corresponding relation is required in order to eliminate messy codes in the acquired text data. The corresponding relation comprises a plurality of dimensions, chiefly the mapping between messy codes and operations. The messy codes in the corresponding relation are matched against the characters in the text data; if they match, the operation corresponding to the messy codes in the corresponding relation is executed. The operations include, but are not limited to, replacing the text data with a preset text, deleting the messy codes, and hiding the text data that comprises the messy codes. To support replacing the text data with a preset text, the corresponding relation further comprises a preset text dimension corresponding to the messy codes.
It is understood that the dimension of the correspondence may also include various information, such as setup time, user name, and the like.
In some embodiments of the present invention, priority levels may be assigned to different corresponding relationships, and the order of use of the different corresponding relationships may be determined according to the different priorities.
That is, the correspondence includes at least two types, and the different types of correspondence correspond to different types of operations and different levels of priority. And selecting the corresponding relations of different types one by one according to the level sequence of the priorities, and executing the subsequent step of judging whether the text data comprises messy codes in the pre-established corresponding relations or not according to the selected corresponding relations.
As to the specific case of the correspondence relationship, for example, the following may be made:
the corresponding relations include a first corresponding relation, a second corresponding relation, a third corresponding relation and a fourth corresponding relation.
1.1 The first corresponding relation comprises a corresponding relation between a first messy code and a first operation. The first operation is to replace the text data to be processed that comprises the first messy code with a first text, where the first messy code comprises all characters in the text data to be processed. If the text data comprises the first messy code and all characters of the text data are the first messy code, that is, the text data is identical to the first messy code, the text data is replaced with the first text according to the first operation.
1.2 The second corresponding relation comprises a corresponding relation between a second messy code and a second operation. The second operation is to replace the text data to be processed that comprises the second messy code with a second text, where the second messy code is a part of the characters in the text data to be processed. That is, the second corresponding relation comprises a corresponding relation among the second messy code, the second text and the second operation; if the text data comprises the second messy code, then as long as the second messy code is a part of the characters in the text data, the text data can be replaced with the second text according to the second operation. The first and second corresponding relations differ in that the first messy code is identical to the text data to be processed, whereas the second messy code is only a part of the text data to be processed.
1.3 The third corresponding relation comprises a corresponding relation between a third messy code and a third operation, and the third operation is to delete the third messy code from the text data to be processed. That is, if the text data comprises the third messy code, the third messy code is deleted from the text data according to the third operation.
1.4 The fourth corresponding relation comprises a corresponding relation between a fourth messy code and a fourth operation, and the fourth operation is to hide the text data to be processed that comprises the fourth messy code. That is, if the text data comprises the fourth messy code, the entire text data is hidden according to the fourth operation; in other words, text data comprising the fourth messy code is not shown when the data is displayed.
In the embodiment of setting the priority, the priority levels of the correspondence relations are, from high to low: a first correspondence, a second correspondence, a third correspondence, and a fourth correspondence.
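The four types of corresponding relation and their priority order can be sketched as below. The names `Op`, `Rule` and `apply_rules` are hypothetical, and two behaviours are assumptions that the text does not settle: processing continues with lower-priority types after a match, and a hidden value is represented as `None`.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Op(IntEnum):
    """Operation types; a lower number means a higher priority."""
    EXACT_REPLACE = 1   # first corresponding relation
    FUZZY_REPLACE = 2   # second corresponding relation
    DELETE = 3          # third corresponding relation
    HIDE = 4            # fourth corresponding relation


@dataclass
class Rule:
    op: Op
    messy: str
    text: str = ""      # preset text, used only by the two replace operations


def apply_rules(value: str, rules: List[Rule]) -> Optional[str]:
    """Return the cleaned value, or None when the value should be hidden."""
    # sorted() is stable, so rules of the same type keep their listed order.
    for rule in sorted(rules, key=lambda r: r.op):
        if rule.op is Op.EXACT_REPLACE and value == rule.messy:
            value = rule.text                       # whole value is the messy code
        elif rule.op is Op.FUZZY_REPLACE and rule.messy in value:
            value = rule.text                       # messy code is part of the value
        elif rule.op is Op.DELETE and rule.messy in value:
            value = value.replace(rule.messy, "")   # strip the messy characters
        elif rule.op is Op.HIDE and rule.messy in value:
            return None                             # do not display this value
    return value


rules = [
    Rule(Op.HIDE, "xx4486e"),
    Rule(Op.DELETE, "%"),
    Rule(Op.DELETE, "@"),
    Rule(Op.FUZZY_REPLACE, "Center", "Central Satellite TV"),
    Rule(Op.EXACT_REPLACE, "Huaxia Satellite TV~", "Huaxia Satellite TV"),
]
channels = ["Huaxia Satellite TV~", "Center##Channel A", "Huaxia%@ Satellite TV", "xx4486e"]
results = [apply_rules(c, rules) for c in channels]
```

Sorting by the enum value keeps the priority order independent of the order in which the rules were created.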
In the embodiment of the present invention, the corresponding relationship is pre-established, and may be established on the processing device by the user, or may be obtained by downloading from the server, and the obtaining source of the corresponding relationship is not specifically limited in the embodiment of the present invention.
In order to more intuitively explain the above, the following description will be made in terms of the correspondence relationship established by the user in the IPTV field. The processing device is a device for executing the method of the embodiment of the present invention, and may be a computer or other devices.
The user Zhang San logs in to the relation-table creation interface of the processing device using the user name "Zhang San". In some embodiments, Zhang San can also log in to the relation-table creation interface using a public option. If a specific user name is used for login, the established corresponding relation is marked with that user name and can then be used only by the user with that user name; if the public option is used, the established corresponding relation is marked as public, indicating that it can be used by any user of the local machine.
Zhang San then operates the processing device to execute a data query. The query result is presented as a table in which the channel dimension is one column, and the channel data is the text data described above. For simplicity of description, Table 1 above is used as the query result, namely:
Table 1:
Channel | Number of viewers | Viewing duration
Huaxia Satellite TV~ | 1000 | 2300
Center##Channel A | 2000 | 4000
Huaxia%@ Satellite TV | 3000 | 5300
xx4486e | 100 | 5000
The data of the query result is collected by a device, so messy codes sometimes appear; for example, the channel data of Table 2 contains messy codes. The user therefore performs the following operations to eliminate the messy codes and to establish corresponding relations for later use.
2.1 Establishing an exact replacement corresponding relation - the first corresponding relation
Zhang San selects the entire channel data item "Huaxia Satellite TV~" in Table 2 and replaces it with "Huaxia Satellite TV". The current processing time is 13:00 on December 11, 2016, so a corresponding relation is established on the processing device and the following data is obtained, as shown in Table A1.1.
Table A1.1:
First column | Second column | Third column | Fourth column
"Huaxia Satellite TV~" | "Huaxia Satellite TV" | 2016-12-11 13:00 | Zhang San
If Zhang San queries new data the next day and the query result again shows "Huaxia Satellite TV~", he selects the entire item "Huaxia Satellite TV~" and replaces it with "Huaxia Satellite TV A". The current processing time is 10:00 on December 12, 2016, so a corresponding relation is established on the processing device and another piece of data is obtained, as shown in Table A1.2.
Table A1.2:
First column | Second column | Third column | Fourth column
"Huaxia Satellite TV~" | "Huaxia Satellite TV A" | 2016-12-11 13:00 to 2016-12-12 10:00 | Zhang San
The two pieces of data in 2.1, i.e. the two corresponding relations, correspond to the first corresponding relation in 1.1. The specific values of a corresponding relation can be represented as a table, for example Table A1.1, which has four columns: the first column is the messy code, the second column is the first text it maps to, the third column is the time period during which the record is valid, and the fourth column is the user name.
The first operation may be recorded in the form of code. Alternatively, identification information of the first operation may be added to the corresponding relation, so that the device executes the first operation when it later reads the identification information. Alternatively, when the first corresponding relation is saved, it may be saved in a storage unit dedicated to corresponding relations of the first operation type. The first operation in the first corresponding relation can be established in any of these specific ways.
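One of the options above, adding identification information of the operation to the corresponding relation, might look like the following sketch. The field names (`messy`, `text`, `op_id`), the identifier strings and the `OPERATIONS` lookup table are all assumptions made for illustration; the patent does not specify a concrete encoding.

```python
from typing import Callable, Dict


def exact_replace(value: str, messy: str, text: str) -> str:
    """First operation: replace the value only when it equals the messy code."""
    return text if value == messy else value


def delete(value: str, messy: str, text: str) -> str:
    """Third operation: remove the messy code; the preset text is unused."""
    return value.replace(messy, "")


# Hypothetical lookup from identification information to the operation's code.
OPERATIONS: Dict[str, Callable[[str, str, str], str]] = {
    "exact_replace": exact_replace,
    "delete": delete,
}


def run(record: Dict[str, str], value: str) -> str:
    """Read the identifier stored in the record and execute that operation."""
    handler = OPERATIONS[record["op_id"]]
    return handler(value, record["messy"], record["text"])


# A record shaped like the row of Table A1.1, plus the assumed identifier field.
record = {
    "messy": "Huaxia Satellite TV~",
    "text": "Huaxia Satellite TV",
    "op_id": "exact_replace",
}
fixed = run(record, "Huaxia Satellite TV~")
```

Storing only an identifier keeps the saved record as plain data; the executable code lives in the lookup table on the device.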
2.2 Establishing a fuzzy replacement corresponding relation - the second corresponding relation
Zhang San selects "Center" in the channel data item "Center##Channel A" in Table 2, controls the processing device to select the fuzzy replacement option, and replaces "Center##Channel A" with "Central Satellite TV". The current time is 13:40 on December 11, 2016, and a corresponding relation is established on the processing device, yielding the following data:
Table A2:
First column | Second column | Third column | Fourth column
"Center" | "Central Satellite TV" | 2016-12-11 13:40 | Zhang San
The data in 2.2 above, i.e. this corresponding relation, corresponds to the second corresponding relation in 1.2 above. The specific values of the corresponding relation can be represented as a table, for example Table A2 above, which has four columns: the first column is the messy code, the second column is the second text it maps to, the third column is the time period during which the record is valid, and the fourth column is the user name.
Wherein the establishment of the second operation in the second correspondence relationship may be described with reference to 1.1 above.
2.3 Establishing a delete corresponding relation - the third corresponding relation
Zhang San selects the "@" character in the channel data of Table 2 and chooses the confirm option, then selects the "%" character in the channel data, and then controls the processing device to select the delete option. The current time is 13:46 on December 11, 2016, and a corresponding relation is established on the processing device, yielding the following data.
Table A3:
First column | Second column | Third column
"%" | 2016-12-11 13:46 | Zhang San
"@" | 2016-12-11 13:46 | Zhang San
The data in 2.3 above, i.e. this corresponding relation, corresponds to the third corresponding relation in 1.3 above. The specific values of the corresponding relation can be represented as a table, for example Table A3 above, which has three columns: the first column stores the messy characters, such as "%", "-", etc., the second column is the time period during which the record is valid, and the third column is the user name.
Wherein the establishment of the third operation in the third correspondence relationship may refer to the description in 1.1 above.
2.4 Establishing a hide corresponding relation - the fourth corresponding relation
Zhang San selects the channel data item "xx4486e" in the channel data of Table 2 and controls the processing device to select the hide option. The current time is 13:50 on December 11, 2016, and a corresponding relation is established on the processing device, yielding the following data:
Table A4:
First column | Second column | Third column
xx4486e | 2016-12-11 13:50 | Zhang San
The data in 2.4 above, i.e. this corresponding relation, corresponds to the fourth corresponding relation in 1.4 above. The specific values of the corresponding relation can be represented as a table, such as Table A4 above, which has three columns: the first column stores the messy characters, the second column is the time period during which the record is valid, and the third column is the user name.
Wherein the establishment of the fourth operation in the fourth correspondence relationship may refer to the description in 1.4 above.
For example, in the establishment of the correspondence, the correspondence that includes the operation of precise replacement is set as a first priority, the correspondence that includes the operation of fuzzy replacement is set as a second priority, the correspondence that includes the operation of deletion is set as a third priority, and the correspondence that includes the operation of hiding is set as a fourth priority.
Dimension information in each table, such as the establishment time and the user name of each corresponding relation, allows the corresponding relations to be used in a variety of ways, meeting the diverse needs of users. The establishment time is a first preset index, and the subsequently acquired text data further comprises a second preset index that matches it; that is, the first preset index can be matched against the corresponding index on the text data. Of course, the first and second preset indexes may also be other types of information.
It will be appreciated that each of the above types of correspondence may include a plurality, for example, the first correspondence includes two, and it may also include more, such as 5, 12, etc.
After the corresponding relations are established, the corresponding relations can be stored on the processing device in various ways, for example, in the form of table or in the form of character string.
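As one possible illustration of storing corresponding relations "in the form of a character string", the records could be serialized with JSON. The choice of JSON and the field names below are assumptions for this sketch; the patent does not name a storage format.

```python
import json

# Records shaped like Table A3 and Table A4, with an assumed "op_id" field
# that identifies the operation type.
records = [
    {"messy": "%", "op_id": "delete", "time": "2016-12-11 13:46", "user": "Zhang San"},
    {"messy": "@", "op_id": "delete", "time": "2016-12-11 13:46", "user": "Zhang San"},
    {"messy": "xx4486e", "op_id": "hide", "time": "2016-12-11 13:50", "user": "Zhang San"},
]

# Serialize to a single string for storage on the processing device.
stored = json.dumps(records, ensure_ascii=False)

# Later, read the string back into the same list of records.
restored = json.loads(stored)
```

`ensure_ascii=False` keeps non-ASCII messy characters readable in the stored string instead of escaping them.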
After the corresponding relations are established, the subsequent text data processing can be performed, as described in detail below. To make the description more intuitive, it takes the IPTV field as the setting and channel data as the text data.
Step 201: a query dataset is obtained.
The query data set includes channel data, which is the text data of the embodiment of the present invention, and messy codes may appear in the channel data. Here, messy codes are the characters to be processed: characters that the user does not need or that are wrong.
For example, after the user Zhang San logs in to a data query page of the IPTV processing device using the user name "Zhang San", he selects a query operation and issues a query instruction, so that the processing device obtains a query data set including channel data, as shown in Table 2 below.
Table 2:
Channel | Number of viewers | Viewing duration
Huaxia Satellite TV~ | 1000 | 2300
Center##Channel A | 2000 | 4000
Huaxia%@ Satellite TV | 3000 | 5300
xx4486e | 100 | 5000
After the processing device obtains the query data set, it may fetch the established corresponding relations from the storage unit in order to look up and process the messy codes in the channel data of the query data set.
After the pre-established corresponding relations are obtained, the corresponding relations can be screened to select the corresponding relations meeting the requirements for subsequent messy code processing.
For example, optionally, the number of the corresponding relations includes a plurality of corresponding relations, each corresponding relation further includes a first preset index, and the text data further includes a second preset index. Therefore, before judging whether the text data includes messy codes in the pre-established corresponding relationship, the method of the embodiment of the invention further comprises the following steps: and determining a target corresponding relation corresponding to the first preset index and the second preset index from the corresponding relations. And when judging whether the text data comprises messy codes in the pre-established corresponding relationship subsequently, the method can be realized by judging whether the text data comprises the messy codes in the target corresponding relationship.
The first and second pre-indicators include multiple forms, for example, the first pre-indicator is a first establishment time of the corresponding relationship, and the second pre-indicator is a second establishment time of the text data.
Alternatively, the screening of the corresponding relationship can be realized by the following steps:
the number of the corresponding relations comprises a plurality of the corresponding relations, each corresponding relation also comprises a user name,
before judging whether the text data comprises messy codes in the pre-established corresponding relation, the method further comprises the following steps: acquiring a user name of a current operation user; and determining the corresponding relation of the user name which is the same as the user name of the current operation user from the plurality of corresponding relations. Therefore, the step of subsequently judging whether the text data comprises messy codes in the pre-established corresponding relationship can be realized by judging whether the text data comprises the messy codes in the determined corresponding relationship.
For example, the processing device screens the pre-established correspondences according to the user and the query date: it filters the first, second, third and fourth correspondences by their pre-stored user name and establishment time, and each retained correspondence satisfies both of the following requirements: its user name is the same as the user name "zhang san" of the current operating user, and the establishment time of the query data set, that is, the acquisition time of the channel data, falls within the establishment time range of the correspondence.
If the user name in a correspondence is "public", the correspondence is available to all users; that is, the user name "public" is treated as identical to the user name of any current operating user.
The correspondences obtained in this way are subsets of the first, second, third and fourth correspondences, respectively. The following flow is then executed using the screened correspondences.
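The screening described above can be illustrated with a minimal Python sketch. The `Correspondence` record, its field names, and the `screen` function are assumptions made for illustration and do not appear in the patent; the sketch only models the two stated requirements (matching user name, with "public" accepted for everyone, and a data time inside the establishment time range).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Correspondence:
    # All field names are hypothetical; the patent only names the concepts.
    messy_code: str        # text treated as a messy code
    operation: str         # name of the operation to apply
    valid_from: datetime   # start of the establishment time range
    valid_to: datetime     # end of the establishment time range
    user_name: str         # owner, or "public" for every user


def screen(correspondences, current_user, data_time):
    """Keep correspondences usable by current_user for data created at data_time."""
    return [
        c for c in correspondences
        if c.user_name in (current_user, "public")
        and c.valid_from <= data_time <= c.valid_to
    ]
```

Treating "public" as a second accepted user name keeps the check to a single membership test; other designs, such as a separate flag, would work equally well.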
Step 202: Judge whether the channel data includes a messy code in the pre-established correspondence. If the channel data includes a messy code in the correspondence, go to step 203.
Step 203: According to the correspondence, process the channel data using the operation corresponding to the messy code in the channel data, so as to eliminate the messy code from the channel data.
If the channel data does not include a messy code in the correspondence, no processing is performed.
If the correspondences are of at least two types, and correspondences of different types correspond to different types of operations and different priority levels,
the judgment of whether the text data includes a messy code in the pre-established correspondence is carried out as follows: the correspondences of different types are used one after another, in order of priority from high to low, to judge whether the text data includes a messy code in each of them.
That is, the correspondences are selected one by one in order of priority, and steps 202 and 203 are executed in turn with each selected correspondence, until all correspondences have been used or the channel data of the query data set from step 201 is determined to be real channel data.
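The priority-ordered loop can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `run_by_priority`, the `(priority, apply_fn)` pairs, and the `known_channels` set used to decide that a value is already real channel data are all hypothetical names, and a return value of `None` is used here to stand for hidden data.

```python
def run_by_priority(value, groups, known_channels):
    """Apply correspondence groups to value in priority order.

    groups: iterable of (priority, apply_fn); a lower number means a
            higher priority. apply_fn(value) returns the processed
            value, or None when the value is to be hidden.
    known_channels: set of values accepted as real channel data
            (an assumption of this sketch).
    """
    for _, apply_fn in sorted(groups, key=lambda g: g[0]):
        if value in known_channels:
            break                      # already real channel data
        value = apply_fn(value)
        if value is None:
            break                      # hidden, nothing left to process
    return value
```

Sorting on the priority number alone keeps the order explicit and independent of the order in which the groups were registered.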
For example, when the correspondences are the exact replacement correspondence, fuzzy replacement correspondence, deletion correspondence and hiding correspondence established above, they are screened by establishment time and user name to obtain a subset of each of the four correspondences, and the subsets are then used to perform the following four steps in order of priority.
3.1 First step: exact matching of the query data set
After filtering by the user in the fourth column and the time in the third column, a subset of the exact replacement correspondences is obtained, and steps 202 and 203 are then performed using this subset. That is, the channel column of the query data set is the dimension column in which messy codes occur, and the system of the processing device exactly matches the channel column of Table Y.0 of the query data set against the first column of Table A1.1.
Table A1.1 is selected rather than Table A1.2 as a result of screening by the establishment time of the exact replacement correspondence. For example, the establishment time of the query data set, that is, the establishment time of the channel data, is December 11, 2016; the establishment time of Table A1.1 is 2016-12-11 13:00, and that of Table A1.2 is 2016-12-11 13:00 to 2016-12-12 10:00. The establishment time of the query data set belongs to the establishment time of Table A1.1, so the processing device selects the correspondences of Table A1.1 for exact matching.
After matching, the channel data in the first row of the query data set, "China satellite television-", is found to be a messy code in one of the exact replacement correspondences, and the channel data comprising this messy code is replaced with the first text "China satellite television" according to the first operation of that correspondence.
Table Y.1 is thus obtained as follows:
Channel                       Number of viewers   Viewing duration
China satellite television    1000                2300
Center # # channel A          2000                4000
'Huaxia' health vision        3000                5300
xx4486e                       100                 5000
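As an illustration of the first operation, the sketch below replaces a value only when the whole value equals a messy code. The function name `exact_replace` and the dictionary layout are assumptions of this sketch; the sample strings follow the example in the text.

```python
def exact_replace(value, table):
    """Replace value with its first text when the whole value is a messy code.

    table: dict mapping a complete messy value to its replacement text
           (a hypothetical stand-in for the exact replacement table).
    """
    return table.get(value, value)


# Sample data following the example in the text.
exact_table = {"China satellite television-": "China satellite television"}
channels = ["China satellite television-", "Center # # channel A"]
cleaned = [exact_replace(c, exact_table) for c in channels]
```

A dictionary lookup matches the "all characters" condition directly: a value that merely contains the messy code as a substring is left untouched.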
3.2 Second step: fuzzy matching of the query data set
Because the exact replacement correspondences cannot enumerate every possible case, fuzzy matching is required in this step.
After filtering by the user in the fourth column and the time in the third column, a subset of the fuzzy replacement correspondences is obtained, and steps 202 and 203 are then performed using this subset. That is, after the first step, the system of the processing device fuzzily matches the channel column of Table Y.1 of the query data set against the first column of Table A2.
After matching, the channel data "Center # # channel A" in the second row of the query data set includes the messy code "Center" from one of the fuzzy replacement correspondences, and the channel data comprising this messy code is replaced with the second text "Central satellite television" according to the second operation of that correspondence.
Table Y.2 is thus obtained as follows:
Channel                       Number of viewers   Viewing duration
China satellite television    1000                2300
Central satellite television  2000                4000
'Huaxia' health vision        3000                5300
xx4486e                       100                 5000
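The second operation differs from the first in that the messy code only needs to be part of the value, while the whole value is still replaced. A minimal sketch, assuming a hypothetical `fuzzy_replace` function and a dictionary from messy fragment to second text:

```python
def fuzzy_replace(value, table):
    """Replace the whole value when it contains any messy fragment.

    table: dict mapping a messy fragment to the replacement text
           (a hypothetical stand-in for the fuzzy replacement table).
    The first fragment found, in table order, decides the replacement.
    """
    for fragment, replacement in table.items():
        if fragment in value:
            return replacement
    return value


# Sample data following the example in the text.
fuzzy_table = {"Center": "Central satellite television"}
fixed = fuzzy_replace("Center # # channel A", fuzzy_table)
```

The substring test here is case-sensitive; whether the original system ignores case is not stated in the text.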
3.3 Third step: deletion matching of the query data set
The third step is performed using Table Y.2 obtained in the second step.
After filtering by the user in the third column and the time in the second column, a subset of the deletion correspondences is obtained, and steps 202 and 203 are then performed using this subset. That is, after the second step, the system of the processing device matches the channel column of Table Y.2 of the query data set against the first column of Table A3.
After matching, the channel data in the third row of the query data set, "chinese% @ satellite view", includes the messy codes "%" and "@" from two of the deletion correspondences, and these messy codes are deleted from the channel data according to the third operation of the correspondences.
Table Y.3 is thus obtained as follows:
[Table Y.3 appears as an image in the original (Figures BDA0001200442570000131 and BDA0001200442570000141) and is not reproduced here.]
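The third operation removes only the messy codes and keeps the rest of the value. The sketch below is illustrative: `delete_codes` is a hypothetical name, and the sample string is an assumed input chosen so that the result is easy to check, not a value quoted from the patent tables.

```python
def delete_codes(value, codes):
    """Delete every occurrence of each messy code from value."""
    for code in codes:
        value = value.replace(code, "")
    return value


# Assumed sample input containing the two messy codes "%" and "@".
result = delete_codes("China%@ satellite television", ["%", "@"])
```

Because each code is removed independently, the order of the codes does not matter as long as no code is a substring of another.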
3.4 Fourth step: hidden matching of the query data set
After filtering by the user in the third column and the time in the second column, a subset of the hiding correspondences is obtained, and steps 202 and 203 are then performed using this subset. That is, after the third step, the system of the processing device matches the channel column of Table Y.3 of the query data set against the first column of Table A4.
After matching, the channel data "xx4486e" in the fourth row of the query data set includes the messy code "xx4486e" from one of the hiding correspondences, and the channel data comprising this messy code is hidden according to the fourth operation of that correspondence.
Table Y.4 is thus obtained as follows:
Channel                       Number of viewers   Viewing duration
China satellite television    1000                2300
Central satellite television  2000                4000
China satellite television    3000                5300
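The fourth operation can be read as leaving out any row whose channel value contains a messy code marked for hiding, which is how the last table above differs from the previous one. The sketch below follows that reading; `hide_rows`, the dictionary keys, and the row layout are assumptions for illustration.

```python
def hide_rows(rows, hidden_codes, column="channel"):
    """Return the rows whose value in column contains none of hidden_codes."""
    return [
        row for row in rows
        if not any(code in row[column] for code in hidden_codes)
    ]


# Rows modeled on the worked example; key names are hypothetical.
rows = [
    {"channel": "China satellite television", "viewers": 1000, "duration": 2300},
    {"channel": "Central satellite television", "viewers": 2000, "duration": 4000},
    {"channel": "China satellite television", "viewers": 3000, "duration": 5300},
    {"channel": "xx4486e", "viewers": 100, "duration": 5000},
]
visible = hide_rows(rows, ["xx4486e"])
```

Returning a new list leaves the stored query data set unchanged, so a hidden row can be shown again if the hiding correspondence is later removed.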
Thus, after the above steps, the messy codes in the channel data of the query data set are eliminated, which solves the problem of messy codes in the text data. When the text data is one dimension of multi-dimensional data, for example when the channel data is one dimension of the query data set, the method solves the problem of messy codes in that dimension.
With this method, the problem of messy codes in IPTV data is solved with the various cases taken into account, and the messy codes can be handled through operations by the user, without the intervention of developers or operation and maintenance personnel.
In summary, after the text data is acquired, it is determined whether the text data includes a messy code in a pre-established correspondence relationship, where the correspondence relationship includes a correspondence relationship between the messy code and the operation. If the text data comprises messy codes in the corresponding relation, the text data is processed by using the operation corresponding to the messy codes according to the corresponding relation, so that the messy codes in the text data can be eliminated. Therefore, the text data comprising the messy codes is compared with the messy codes of the corresponding relation, and if the text data comprises the messy codes of the corresponding relation, the messy codes can be eliminated from the text data by using the operation of the corresponding relation.
Fig. 3 is a schematic structural diagram of a messy code processing apparatus according to an embodiment of the present invention. The apparatus may be integrated on a processing device to perform the methods illustrated in Fig. 1 and Fig. 2. Referring to Fig. 3, the apparatus of an embodiment of the present invention comprises:
an acquisition unit 301 configured to acquire text data;
a judging unit 302, configured to judge whether the text data includes a messy code in a pre-established correspondence, where the correspondence includes a correspondence between the messy code and an operation;
the processing unit 303 is configured to, if the text data includes a messy code in the corresponding relationship, process the text data by using an operation corresponding to the messy code according to the corresponding relationship, so as to eliminate the messy code in the text data.
Optionally,
the correspondences are of at least two types, and correspondences of different types correspond to different types of operations and different priority levels;
the judging unit 302 is further configured to judge, using the correspondences of different types one after another in order of priority, whether the text data includes a messy code in the correspondences.
Optionally,
the correspondences comprise a first correspondence, a second correspondence, a third correspondence and a fourth correspondence;
the first correspondence comprises a correspondence between a first messy code and a first operation, the first operation is to replace the text data to be processed comprising the first messy code with a first text, and the first messy code comprises all characters in the text data to be processed;
the second correspondence comprises a correspondence between a second messy code and a second operation, the second operation is to replace the text data to be processed comprising the second messy code with a second text, and the second messy code is a part of the characters in the text data to be processed;
the third correspondence comprises a correspondence between a third messy code and a third operation, and the third operation is to delete the third messy code from the text data to be processed;
the fourth correspondence comprises a correspondence between a fourth messy code and a fourth operation, and the fourth operation is to hide the text data to be processed comprising the fourth messy code.
Optionally,
the priority levels of the correspondences are, from high to low: the first correspondence, the second correspondence, the third correspondence and the fourth correspondence.
Optionally,
there are multiple correspondences, each correspondence further includes a first preset index, and the text data further includes a second preset index;
the apparatus of the embodiment of the invention further comprises:
an index determining unit 305, configured to determine, from the multiple correspondences, a target correspondence whose first preset index corresponds to the second preset index;
the judging unit 302 is further configured to judge whether the text data includes a messy code in the target correspondence.
Optionally,
the first preset index is the first establishment time of the correspondence, and the second preset index is the second establishment time of the text data.
Optionally,
there are multiple correspondences, and each correspondence further includes a user name;
the apparatus of the embodiment of the invention further comprises:
a name obtaining unit 306, configured to obtain the user name of the current operating user;
a name determining unit 304, configured to determine, from the multiple correspondences, the correspondence whose user name is the same as the user name of the current operating user;
the judging unit 302 is further configured to judge whether the text data includes a messy code in the determined correspondence.
In summary, after the acquisition unit 301 acquires the text data, the judging unit 302 judges whether the text data includes a messy code in a pre-established correspondence, where the correspondence includes a correspondence between the messy code and an operation. If the text data includes a messy code in the correspondence, the processing unit 303 processes the text data according to the correspondence by using the operation corresponding to the messy code, so as to eliminate the messy code from the text data. Thus, the text data is compared with the messy codes of the correspondence, and if it includes one of them, the messy code can be eliminated from the text data by using the operation of the correspondence.
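The division into an acquisition unit, a judging unit and a processing unit can be pictured with the following sketch. It is only one possible arrangement under stated assumptions: the class `MessyCodeDevice`, its method names, and the representation of a correspondence as a dictionary from messy code to a callable operation are all hypothetical.

```python
class MessyCodeDevice:
    """Toy model of the three units; all names are illustrative."""

    def __init__(self, correspondences):
        # correspondences: dict mapping a messy code to an operation,
        # where an operation is a function from text to text.
        self.correspondences = correspondences

    def acquire(self, source):
        """Acquisition unit: obtain text data from any iterable."""
        return list(source)

    def judge(self, text):
        """Judging unit: list the messy codes of the correspondences found in text."""
        return [code for code in self.correspondences if code in text]

    def process(self, text):
        """Processing unit: apply the operation of each messy code found."""
        for code in self.judge(text):
            text = self.correspondences[code](text)
        return text


device = MessyCodeDevice({
    "%": lambda t: t.replace("%", ""),
    "@": lambda t: t.replace("@", ""),
})
```

Text without any messy code passes through `process` unchanged, which mirrors the case where no processing is performed.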
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and various other media capable of storing program code.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for processing messy codes, comprising:
acquiring text data;
judging whether the text data comprises messy codes in a pre-established corresponding relation, wherein the corresponding relation comprises a corresponding relation between the messy codes and operations;
if the text data comprises messy codes in the corresponding relation, processing the text data by using operations corresponding to the messy codes according to the corresponding relation so as to eliminate the messy codes in the text data;
the corresponding relations comprise at least two types, and the corresponding relations of different types correspond to different types of operations and different levels of priorities;
the judging whether the text data comprises messy codes in the pre-established corresponding relation comprises the following steps:
and judging whether the text data comprises messy codes in the corresponding relations or not by using the corresponding relations of different types from first to last according to the level sequence of the priority.
2. The method of claim 1,
the corresponding relations comprise a first corresponding relation, a second corresponding relation, a third corresponding relation and a fourth corresponding relation,
the first corresponding relation comprises a corresponding relation between a first messy code and a first operation, the first operation is to replace the text data to be processed comprising the first messy code with a first text, and the first messy code comprises all characters in the text data to be processed;
the second corresponding relation comprises a corresponding relation between a second messy code and a second operation, the second operation is to replace the text data to be processed comprising the second messy code with a second text, and the second messy code is a part of characters in the text data to be processed;
the third corresponding relation comprises a corresponding relation between a third messy code and a third operation, and the third operation is to delete the third messy code from the text data to be processed;
the fourth corresponding relation comprises a corresponding relation between a fourth messy code and a fourth operation, and the fourth operation is to hide the text data to be processed comprising the fourth messy code.
3. The method of claim 2,
the priority levels of the corresponding relations are sequentially from high to low: the first correspondence, the second correspondence, the third correspondence, and the fourth correspondence.
4. The method according to any one of claims 1 to 3,
the number of the corresponding relations comprises a plurality of corresponding relations, each corresponding relation further comprises a first preset index, the text data further comprises a second preset index,
before the determining whether the text data includes messy codes in the pre-established correspondence, the method further includes:
determining, from the corresponding relations, a target corresponding relation whose first preset index corresponds to the second preset index;
the judging whether the text data comprises messy codes in the pre-established corresponding relation comprises the following steps:
and judging whether the text data comprises messy codes in the target corresponding relation.
5. The method of claim 4,
the first preset index is the first establishment time of the corresponding relation, and the second preset index is the second establishment time of the text data.
6. The method according to any one of claims 1 to 3,
the number of the corresponding relations comprises a plurality of corresponding relations, each corresponding relation also comprises a user name,
before the determining whether the text data includes messy codes in the pre-established correspondence, the method further includes:
acquiring a user name of a current operation user;
determining, from the corresponding relations, the corresponding relation whose user name is the same as the user name of the current operation user;
the judging whether the text data comprises messy codes in the pre-established corresponding relation comprises the following steps:
and judging whether the text data comprises messy codes in the determined corresponding relation.
7. An apparatus for processing messy codes, comprising:
an acquisition unit configured to acquire text data;
a judging unit configured to judge whether the text data comprises messy codes in a pre-established corresponding relation, wherein the corresponding relation comprises a corresponding relation between the messy codes and operations;
a processing unit configured to, if the text data comprises messy codes in the corresponding relation, process the text data by using the operation corresponding to the messy codes according to the corresponding relation, so as to eliminate the messy codes in the text data;
the corresponding relations comprise at least two types, and the corresponding relations of different types correspond to different types of operations and different levels of priorities;
the judging unit is further configured to judge whether the text data comprises messy codes in the corresponding relations by using the corresponding relations of different types from first to last according to the level sequence of the priority.
8. The apparatus of claim 7,
the corresponding relations comprise a first corresponding relation, a second corresponding relation, a third corresponding relation and a fourth corresponding relation,
the first corresponding relation comprises a corresponding relation between a first messy code and a first operation, the first operation is to replace the text data to be processed comprising the first messy code with a first text, and the first messy code comprises all characters in the text data to be processed;
the second corresponding relation comprises a corresponding relation between a second messy code and a second operation, the second operation is to replace the text data to be processed comprising the second messy code with a second text, and the second messy code is a part of characters in the text data to be processed;
the third corresponding relation comprises a corresponding relation between a third messy code and a third operation, and the third operation is to delete the third messy code from the text data to be processed;
the fourth corresponding relation comprises a corresponding relation between a fourth messy code and a fourth operation, and the fourth operation is to hide the text data to be processed comprising the fourth messy code.
CN201611264769.XA 2016-12-30 2016-12-30 Method and device for processing messy codes Active CN108271041B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611264769.XA CN108271041B (en) 2016-12-30 2016-12-30 Method and device for processing messy codes

Publications (2)

Publication Number Publication Date
CN108271041A CN108271041A (en) 2018-07-10
CN108271041B true CN108271041B (en) 2021-01-22

Family

ID=62770158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611264769.XA Active CN108271041B (en) 2016-12-30 2016-12-30 Method and device for processing messy codes

Country Status (1)

Country Link
CN (1) CN108271041B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0728810A (en) * 1993-07-14 1995-01-31 Matsushita Electric Ind Co Ltd Character processing method and device therefor
CN102479174A (en) * 2010-11-23 2012-05-30 盛乐信息技术(上海)有限公司 Chinese character automatic checking and error-correcting system aiming at GBK (Chinese Internal Code Specification) encoding and method thereof
CN104424010A (en) * 2013-09-06 2015-03-18 北大方正集团有限公司 Method and system for detecting and repairing text document messy codes
CN104516862A (en) * 2013-09-29 2015-04-15 北大方正集团有限公司 Method and system for selecting and reading coded format of target document
CN104750663A (en) * 2013-12-27 2015-07-01 阿里巴巴集团控股有限公司 Identification method and device for text messy codes in page
CN105426390A (en) * 2015-10-23 2016-03-23 广东小天才科技有限公司 Image recognition-based question search method and system

Also Published As

Publication number Publication date
CN108271041A (en) 2018-07-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100080 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

GR01 Patent grant