WO2020194323A1 - Method and system for message spam detection in communication networks - Google Patents

Method and system for message spam detection in communication networks

Info

Publication number
WO2020194323A1
Authority
WO
WIPO (PCT)
Prior art keywords
spam
message
user
sender
flag
Prior art date
Application number
PCT/IN2019/050253
Other languages
French (fr)
Inventor
Mahesh Babu JAYARAMAN
Sandhya BASKARAN
Perepu SATHEESH KUMAR
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/IN2019/050253
Publication of WO2020194323A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/10: Office automation; Time management
    • G06Q10/107: Computer-aided management of electronic mailing [e-mailing]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21: Monitoring or handling of messages
    • H04L51/212: Monitoring or handling of messages using filtering or selective blocking

Definitions

  • the present disclosure relates to the field of network communication; and more specifically, to message spam detection in communication networks.
  • SMS short message service
  • MMS Multimedia Messaging Service
  • MMS is another mechanism that can be used to send messages to and from a wireless phone over a wireless communication network.
  • the MMS is a standard protocol that extends the core SMS capability, allowing the exchange of text messages greater in length. Unlike text-only SMS, MMS can deliver a variety of media, including videos, images, or audio.
  • a spam message (e.g., SMS spam or MMS spam) is an unwanted or an unsolicited message (such as SMS messages or MMS messages) that is sent indiscriminately to a device. Often, a spam message is sent for marketing purposes.
  • a spam message can take the form of a simple message, a link to a number to call or text, multimedia content, a link to a website for more information, or a link to a website to download an application.
  • Spam messages are a problem for users of devices, as they bombard the users with irrelevant or uncalled-for messages, draining time and resources.
  • Some mechanisms exist today for the detection of spam messages. Some existing spam detection mechanisms use the content of the message to detect a spam message. These approaches only classify the content of the message as spam or not, without considering the sender, which may also be a spam sender.
  • Some other approaches to spam detection, such as Do Not Disturb (DND) set by operators, either block all the messages from a sender or unblock the messages based on the user's interests and registrations.
  • the location of the sender of the message can also be considered in addition to the content of the message sent to determine whether the message is a spam message or not.
  • this spam detection approach, similarly to the other ones, is only interested in classifying the message as spam or not, rather than classifying the user.
  • One general aspect includes a method performed by a network node for spam detection in a communication network, the method including: receiving, from a first device of a first sender, a message addressed to one or more receivers; determining a set of features for the message, where the set of features includes at least one or more content features that relates to content of the message, and one or more user features that relate to the first sender; determining, based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content; responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, transmitting a notification to one or more devices of the one or more receivers including the at least one of a user flag and a message flag, where the user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content; and responsive to determining that the first sender is not a spam sender and the message does not include spam content, transmit
  • One general aspect includes a network node for spam detection in a communication network, the network node including: one or more processors; and non-transitory computer readable storage media that stores instructions, which when executed by the one or more processors cause the network node to: receive, from a first device of a first sender, a message addressed to one or more receivers; determine a set of features for the message, where the set of features includes at least one or more content features that relates to content of the message, and one or more user features that relate to the first sender; determine, based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content.
  • the network node is further to, responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, transmit a notification to one or more devices of the one or more receivers including the at least one of a user flag and a message flag, where the user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content; responsive to determining that the first sender is not a spam sender and the message does not include spam content, transmit the message to the one or more devices of the one or more receivers.
  • Figure 1 illustrates a block diagram of an exemplary system for enabling message spam detection in a communication network, in accordance with some embodiments.
  • Figure 2 illustrates a block diagram of a message feature extractor, in accordance with some embodiments.
  • Figure 3 illustrates a block diagram of exemplary spam feedback repository and message repository, in accordance with some embodiments.
  • Figure 4 illustrates a block diagram of an exemplary prediction model generator, in accordance with some embodiments.
  • Figure 5 illustrates a flow diagram of exemplary operations for message spam detection, in accordance with some embodiments.
  • Figure 6 illustrates a flow diagram of exemplary notification that can be sent to a user following detection of message spam in accordance with some embodiments.
  • Figure 7 illustrates a flow diagram of exemplary operations for receiving spam feedback from a device, in accordance with some embodiments.
  • Figure 8 illustrates a flow diagram of exemplary operations for receiving spam feedback from another device, in accordance with some embodiments.
  • Figure 9 illustrates a block diagram for a network device that can be used for implementing one or more of the network devices described herein, in accordance with some embodiments.
  • Figure 10 illustrates a block diagram for a device that can be used for implementing one or more devices described herein, in accordance with some embodiments.
  • references in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • Bracketed text and blocks with dashed borders may be used herein to illustrate optional operations that add additional features to embodiments of the inventive concept. However, such notation should not be taken to mean that these are the only options or optional operations, and/or that blocks with solid borders are not optional in certain embodiments of the inventive concept.
  • Coupled is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other.
  • Connected is used to indicate the establishment of communication between two or more elements that are coupled with each other.
  • the embodiments herein present a mechanism for automatically classifying a sender of a message in a communication network as being a spam user or the message received from this sender as being a spam message. This allows the detection of a spam user who sent a normal message (e.g., a bank transaction related message), the detection of a normal user who sent a spam message (e.g., a marketing related message), or the detection of a spam user who sent a message including spam content.
  • the embodiments herein present several advantages when compared with existing spam detection solutions.
  • the proposed solutions enhance the mobile network user experience by providing more nuanced information on the type of spam that can be received by a user.
  • By transmitting a flag identifying whether a sender of a message is a spam user and/or another flag identifying whether the message includes spam content, the user is presented with a more detailed classification of the messages they receive.
  • the solution presented herein enables the transmission of a notification to a receiver of a message for notifying the receiver that there is a potential of spam.
  • This notification can include the message and one or more additional spam flags for the message that indicate whether one or both of the message content and the sender are spam.
  • the notification may not include the message, e.g., the message is blocked, and the notification only includes the spam flag(s) to inform the user that there is a potential of a user spam or content spam.
  • the mechanisms described herein enable users that receive messages to report a message and/or a sender of the message as spam.
  • the solution presented herein provides a dynamic spam detection platform that is adaptable to feedback received from the end-user.
  • the dynamic spam detection platform can be used in a messaging service platform that can be either an SMS-based service or an MMS-based service.
  • the current solution adapts to the actual user experience, resulting in an improved user experience of the messaging service.
  • Figure 1 illustrates a block diagram of an exemplary system for enabling message spam detection in a communication network, in accordance with some embodiments.
  • a wireless network may further include any additional elements suitable to support communication between wireless devices or between a wireless device and another communication device, such as a landline telephone, a service provider, or any other network node or end device.
  • the wireless network may provide communication and other types of services to one or more wireless devices to facilitate the wireless devices’ access to and/or use of the services provided by, or via, the wireless network.
  • the wireless network 100 may comprise and/or interface with any type of communication, telecommunication, data, cellular, and/or radio network or other similar type of system.
  • the wireless network may be configured to operate according to specific standards or other types of predefined rules or procedures.
  • particular embodiments of the wireless network may implement communication standards, such as Global System for Mobile Communications (GSM), Universal Mobile Telecommunications System (UMTS), Long Term Evolution (LTE), and/or other suitable 2G, 3G, 4G, or 5G standards; wireless local area network (WLAN) standards, such as the IEEE 802.11 standards; and/or any other appropriate wireless communication standard, such as the Worldwide Interoperability for Microwave Access (WiMax), Bluetooth, Z-Wave and/or ZigBee standards.
  • the wireless network 100 may comprise any number of wired or wireless networks, base stations, controllers, wireless devices, relay stations, and/or any other components or systems that may facilitate or participate in the communication of data and/or signals whether via wired or wireless connections.
  • Each one of the wireless devices 102A-102N is an electronic device that is operative to communicate and connect with the message management center 110 through a combination of wireless and wired network technologies.
  • Each one of the wireless devices 102A-102N also referred to herein as WD 102A-N, is capable of transmitting one or multiple messages, e.g., messages 101A-M, through the network 107 to the message management center 110.
  • the WDs 102A-N are also operative to receive messages 105A-L or notifications 103A-K from the message management center 110.
  • the WD 102A-N are also operative to transmit spam feedback 109 to the message management center 110.
  • While Figure 1 illustrates a set of WDs 102A-N, any number of WDs can transmit messages to and/or receive messages from the message management center.
  • Each one of the WDs can be associated with a user of the device.
  • a user can be a person who handles the device and has a subscription to a wireless service in the network 100.
  • the user of the WD can be a corporation, a bot, or any other entity that is operative to handle/control a wireless device and generate and transmit messages from the wireless device.
  • a receiver or recipient can be interchangeably used to refer to a user of the WD when the wireless device receives messages.
  • the message management center 110 includes a message gateway 120, a message repository 130, a message feature extractor 140, a message spam manager 150, a prediction model generator 160, and a spam feedback repository 170.
  • the message gateway 120 is a network node that is operative to receive messages from WDs 102A-N.
  • the message gateway 120 is further operative to transmit messages 105A-L and notifications 103A-K to one or more of the WDs 102A-N.
  • the message gateway 120 is also operative to receive spam feedback 109 from the WDs 102A-N.
  • the messages can be any of short text messages transmitted via a short message service (SMS), or multimedia messages transmitted via a multimedia message service (MMS).
  • the MMS extends the core SMS capability, and allows the exchange of text messages greater in length than what is allowed in SMS. Unlike text-only SMS, MMS can deliver a variety of media, including video, images, or audio.
  • the notifications 103A-K include one or more flags that indicate whether a sender of a message 105A-L is a spam user or, alternatively, whether the message includes spam content.
  • the notifications 103A-K may include the message itself.
  • the notifications 103A-K may include only the flag, or the flag and one or more additional information related to the message or the sender that caused the generation of the spam flag.
  • the additional information can include an identifier of the sender.
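  • The notification contents described above can be sketched as a small data structure. This is a minimal illustration only; the class name and the field names (`user_flag`, `message_flag`, `message`, `sender_id`) are assumptions and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpamNotification:
    """Hypothetical shape of a notification 103; all field names are assumptions."""
    user_flag: bool                   # True if the sender is classified as a spam user
    message_flag: bool                # True if the message is classified as spam content
    message: Optional[str] = None     # Present only when the message itself is included
    sender_id: Optional[str] = None   # Optional additional information about the sender

    def __post_init__(self) -> None:
        # A notification is only generated when at least one spam flag is raised.
        if not (self.user_flag or self.message_flag):
            raise ValueError("a notification needs at least one spam flag")


# Flag-only notification: the message itself is withheld.
blocked = SpamNotification(user_flag=True, message_flag=False, sender_id="+10000000000")
# Notification that carries the message together with its flag.
delivered = SpamNotification(user_flag=False, message_flag=True, message="Big sale today!")
```

  • Keeping `message` optional lets the same structure cover both variants described above: a notification that carries the message, and one that carries only the flag(s).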
  • the spam feedback 109 is a feedback provided by the receiver of a message indicating whether the message received included spam content or originated from a spam user.
  • the message management center 110 includes a message repository 130 and a spam feedback repository 170.
  • the message repository 130 is adapted to store the messages 101A-M received from the WDs 102A-N. Each one of the message gateway 120 and the message feature extractor 140 can retrieve one or more messages from the message repository 130 when needed.
  • the spam feedback repository 170 is adapted to store the spam feedback received from the WDs 102A-N.
  • the spam feedback repository 170 is adapted to transmit the spam feedback to the prediction model generator 160.
  • the message management center 110 can be used in a wireless network to provide a messaging service to subscribers/users.
  • the message management center 110 is operative to perform spam detection in the messaging service.
  • the message feature extractor 140 receives one or more messages from the message repository 130 and determines a set of features 111 for the message.
  • the set of features 111 can include a set of content-based features and a set of user-based features.
  • the content-based features include features that are determined based on the content of the message received.
  • the user-based features include features determined based on the sender of the message.
  • the prediction model generator 160 is a network node that is operative to receive the set of features 111 determined for multiple messages and determine a spam prediction model 113.
  • the spam prediction model can be determined based on message features as well as based on spam feedback stored in the spam feedback repository 170.
  • the spam prediction model 152 is fed to the message spam manager 150 to be used within the message spam detector 151.
  • Upon determination that the sender of the message is a spam user, the message spam detector 151 outputs a user flag indicating that the sender of the message is a spam user.
  • Upon determination that the message includes spam content, the message spam detector 151 outputs a message flag indicating that the message includes spam content.
  • the message spam manager 150 includes a message spam detector 151 and a message spam flagger 153.
  • the message spam detector 151 includes the spam prediction model 152 and is operative to determine, based on the spam prediction model and the set of features 111 for a message, whether the message includes spam content and whether the sender of the message is a spam user.
  • the message spam flagger 153 is operative to receive at least one of the user flag and the message flag for a message and generate a notification to be sent to the receiver based on these flags. The notification is sent to the message gateway 120 to be transmitted to the receiver, e.g., WD 102B.
  • the message spam flagger 153 may add a spam flag entry into the header/footer of a message that is to be sent to the receiver. This is performed through an enhanced messaging protocol that allows for the transmission of additional spam information with the messages transmitted. Alternatively, the spam flags are transmitted as messages, without the transmission of the spam message.
  • the components 140, 150, 160, and 170 are new elements added to a standard message management center 110.
  • the components 140-170 are operative to perform new operations of spam detection in a message center. While the components 140-170 are shown as part of the message management center 110 as the set 180, in other embodiments, these elements can be separate from the message management center 110 and offered as an add-on to the services of the message management center 110.
  • the set of new components 180 uses a blended classification model for detecting spam in messaging services such as SMS or MMS.
  • the message management center 110 is operative to receive, from a first wireless device, e.g., WD 102A, of a first sender (not illustrated), a message 101A addressed to one or more receivers.
  • the message 101A can be addressed to a user of the WD 102B only, or to the user of the WD 102B and users of one or more additional WDs such as WD 102W.
  • the message management center 110 determines a set of features for the message.
  • the set of features includes content features and user features.
  • the content features relate to the content of the message, and the user features relate to the sender of the message.
  • the message management center 110 determines, based on a spam prediction model, e.g., spam prediction model 152, and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content. Responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, the message management center 110 transmits a notification to one or more wireless devices, e.g., WD 102B-N, of the one or more receivers including the at least one of a user flag and a message flag. The user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content.
  • the message management center 110 may determine, based on the spam prediction model 152, that the message does not include spam content and that the sender of the message is not a spam user. In that case, the message management center 110 transmits the message 101A to the one or more wireless devices of the one or more receivers without any modifications.
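  • The decision described above, notify when at least one flag is raised and otherwise forward the message unmodified, can be sketched as follows. The function name, the dictionary layout, and the `block_flagged` policy switch are hypothetical and are used only for illustration.

```python
from typing import Dict, Optional


def route_message(message: str, sender_is_spam: bool, content_is_spam: bool,
                  block_flagged: bool = False) -> Dict[str, Optional[object]]:
    """Decide what is sent to the receivers for one message.

    `block_flagged` is a hypothetical policy switch: when True, a flagged
    message is withheld and only the spam flags are sent.
    """
    if not sender_is_spam and not content_is_spam:
        # Neither flag raised: forward the message without any modification.
        return {"kind": "message", "message": message}
    # At least one flag raised: send a notification carrying both flags.
    return {
        "kind": "notification",
        "user_flag": sender_is_spam,
        "message_flag": content_is_spam,
        "message": None if block_flagged else message,
    }


clean = route_message("Your balance is 100", False, False)
flagged = route_message("Buy now!", sender_is_spam=False, content_is_spam=True)
withheld = route_message("Hello", sender_is_spam=True, content_is_spam=False,
                         block_flagged=True)
```

  • Treating the two flags independently covers the three spam cases the disclosure distinguishes: a spam user sending normal content, a normal user sending spam content, and a spam user sending spam content.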
  • the message management center 110 is further operative to receive spam feedback 109 from the WDs 102A-N and use the spam feedback in the determination of the spam prediction model 152.
  • the spam feedback can be received as a result of the WD receiving a notification and determining whether the notification is accurate or not.
  • the spam feedback can be received as a result of the receipt of a message and determining whether the message includes spam content, and/or the sender of the message is a spam user.
  • the message management center 110 is operative to receive from a WD spam feedback that indicates that the sender is not a spam user, when the WD had received a notification indicating that the sender of a message is a spam user.
  • the message management center 110 may transmit the message(s) that caused the transmission of the user spam notification to the receiver and which were previously blocked as a result of a determination that the sender of the message(s) is a spam user.
  • the message management center 110 may determine, based on a message identifier included in the spam feedback, the message that caused the transmission of the spam flag to the WD, retrieve the message from the message repository 130, and transmit the message to the WD.
  • the message management center 110 is operative to offer an enhanced message service (e.g., an enhanced SMS service or MMS service) that enables spam detection in exchanged messages.
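  • The release of a previously blocked message after "not spam" feedback can be sketched as below. This is a toy stand-in under stated assumptions: the class name, its attributes, and the in-memory dictionary used in place of the message repository 130 are all hypothetical.

```python
from typing import Dict, List, Optional, Set


class MessageCenterSketch:
    """Toy stand-in for the message repository 130 and the gateway 120."""

    def __init__(self) -> None:
        self.repository: Dict[str, str] = {}   # message identifier -> message body
        self.blocked: Set[str] = set()         # identifiers withheld from the receiver
        self.sent: List[str] = []              # bodies released to the receiver

    def store_blocked(self, message_id: str, body: str) -> None:
        """Store a message that was withheld because its sender was flagged."""
        self.repository[message_id] = body
        self.blocked.add(message_id)

    def handle_feedback(self, message_id: str, sender_is_spam: bool) -> Optional[str]:
        """Release the blocked message when feedback says the sender is not spam."""
        if sender_is_spam or message_id not in self.blocked:
            return None
        body = self.repository[message_id]     # look up by the identifier in the feedback
        self.blocked.discard(message_id)
        self.sent.append(body)                 # stands in for transmission to the WD
        return body


center = MessageCenterSketch()
center.store_blocked("m1", "Your parcel has shipped")
```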
  • the embodiments herein present several advantages when compared with existing spam detection solutions.
  • the proposed solutions enhance the mobile network user experience by providing more nuanced information on the type of spam that can be received by a user. By transmitting a flag identifying whether a sender of a message is a spam user and/or another flag identifying whether the message includes spam content, the user is presented with a more detailed classification of the messages they receive.
  • the solution presented herein enables the transmission of a notification to a receiver of a message for notifying the receiver that there is a potential of spam.
  • This notification can include the message and one or more additional spam flags for the message that indicate whether one or both of the message content and the user are spam.
  • the notification may not include the message, e.g., the message is blocked, and the notification only includes the spam flag(s) to inform the user that there is a potential of a user spam or content spam.
  • the mechanisms described herein enable users to report a message and/or a sender of the message as spam.
  • the embodiments herein use a combination of content features and user features to determine whether the receiver is receiving spam. This allows a more robust and better-quality spam detection service to be offered to users of the wireless network 100, as spam may be received from non-spam users and spam users may send non-spam content.
  • the use of content features as well as user features for the detection of spam enables a more nuanced spam detection and notifications.
  • Upon receipt of a message 101A in the message gateway 120 of the message management center 110 from a sender 102A, the message 101A is stored in the message repository 130. For example, the message can be destined to one or multiple ones of the WDs 102B-N. Prior to being transmitted to the receivers 102B-N, the message is analyzed to determine whether the message includes spam content or the sender is a spam user. The spam detection is performed at least in part based on a set of features 111 extracted by the message feature extractor 140 from the message 101A.
  • Figure 2 illustrates a block diagram of a message feature extractor, in accordance with some embodiments. The message feature extractor 140 is operative to extract features of a message.
  • the features of the message can include content-based features related to the content of the message and user-based features related to the sender of the message.
  • the features may further include features that relate to the receiver of the message, and one or more additional features that relate to a history of conversation between the sender and the receiver of the message.
  • a set of features of a message includes values derived from the message and related information that form an initial set of measured data. These features are intended to be informative and non-redundant values that represent the initial set of measured data.
  • the initial set of data includes the content of the message (e.g., text, video, audio, and/or image), information on the sender of the message (e.g., location of the sender), information on the receiver of the message, and historical data related to the message and/or the sender/receiver.
  • Each set of features represents a message.
  • the features extracted from a message can be used during a learning phase, in which the message management center 110 is operative to learn based on one or more data sets and determine a spam prediction model to be used for spam detection.
  • the data sets are determined based on features of several messages when combined with spam feedback and/or result of external classification/prediction models.
  • the features of a message can further be used during a prediction phase.
  • spam flags are determined based on the spam prediction model 152 for the message 101A and the features.
  • the features 111 of the message 101A are input into a spam prediction model 152 and the spam prediction model 152 outputs a set of spam flags for the message.
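  • The learning phase and the prediction phase can be illustrated with a deliberately simple stand-in model. The sketch below uses a nearest-centroid classifier per flag, which is a swapped-in technique chosen for brevity and not the prediction model of the disclosure; the feature names, the labels, and the data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Row = Tuple[Vector, bool, bool]   # (features, sender is spam user, content is spam)
Model = Dict[str, Tuple[List[float], List[float]]]


def _centroid(rows: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(dataset: List[Row]) -> Model:
    """Learning phase: one (negative, positive) centroid pair per flag.

    Labels would come from spam feedback. Assumes both classes are present
    for each flag.
    """
    model: Model = {}
    for name, idx in (("user_flag", 1), ("message_flag", 2)):
        pos = [row[0] for row in dataset if row[idx]]
        neg = [row[0] for row in dataset if not row[idx]]
        model[name] = (_centroid(neg), _centroid(pos))
    return model


def predict(model: Model, features: Vector) -> Dict[str, bool]:
    """Prediction phase: a flag is set when the positive centroid is closer."""
    return {
        name: _sq_dist(features, pos) < _sq_dist(features, neg)
        for name, (neg, pos) in model.items()
    }


# Hypothetical normalized features: [recipient ratio, URL ratio].
data: List[Row] = [
    ([0.9, 0.0], True, False),    # spam user, normal content
    ([0.1, 0.9], False, True),    # normal user, spam content
    ([0.8, 0.8], True, True),     # spam user, spam content
    ([0.1, 0.0], False, False),   # normal user, normal content
]
model = train(data)
flags = predict(model, [0.9, 0.1])
```

  • With this toy data, `flags` sets the user flag but not the message flag, which corresponds to a likely spam user sending normal content.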
  • the message feature extractor 140 includes a content feature extractor 142, a number of message recipients determiner 145, a message sender determiner 146, a time indicator determiner 147, a sender location determiner 148, and a conversation history determiner 149.
  • the message feature extractor 140 may include a message type determiner 141.
  • When the message management center 110 is adapted to handle SMS messages only, the message type determiner 141 may not be included and the content feature extractor 142 includes only a text feature extractor 144.
  • the messages received do not include any multimedia content and the extraction of text feature is sufficient to represent the content of the message.
  • the message type determiner 141 can be included and the content feature extractor 142 includes a text feature extractor 144 and a multimedia feature extractor 143.
  • the messages received may include both text and multimedia content and a combination of features can be determined to represent the content of the message.
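  • The split between text and multimedia feature extraction can be sketched as below. The specific features (lengths, URL count, attachment type and size) are illustrative assumptions, since the passage above does not enumerate them, and the function names are hypothetical.

```python
import re
from typing import Dict, Optional


def extract_text_features(text: str) -> Dict[str, float]:
    """Hypothetical text features computed from the message body."""
    return {
        "text_length": float(len(text)),
        "word_count": float(len(text.split())),
        "url_count": float(len(re.findall(r"https?://\S+", text))),
    }


def extract_multimedia_features(media_type: str, size_bytes: int) -> Dict[str, float]:
    """Hypothetical multimedia features based only on attachment metadata."""
    return {
        "is_image": 1.0 if media_type.startswith("image/") else 0.0,
        "is_video": 1.0 if media_type.startswith("video/") else 0.0,
        "is_audio": 1.0 if media_type.startswith("audio/") else 0.0,
        "media_size_kb": size_bytes / 1024.0,
    }


def extract_content_features(text: str, media_type: Optional[str] = None,
                             size_bytes: int = 0) -> Dict[str, float]:
    """Treat a message with an attachment as MMS, otherwise as text-only SMS."""
    features = extract_text_features(text)
    if media_type is not None:
        # Multimedia message: combine text and multimedia features.
        features.update(extract_multimedia_features(media_type, size_bytes))
    return features


sms = extract_content_features("Visit http://example.com now")
mms = extract_content_features("Look at this", media_type="image/jpeg", size_bytes=2048)
```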
  • a message 101A is received by the message feature extractor 140.
  • the message 101A can be received as a result of the message feature extractor 140 retrieving one or more messages from the message repository 130. For example, this can occur during a training phase of the spam prediction model 152, in which a data set is to be built based on the features of multiple messages.
  • the message can be received automatically by the message feature extractor 140 without having transmitted a request for a message. For example, this can occur during a classification/prediction phase, in which spam detection of a message is to be performed based on the features of the message.
  • the number of message recipients determiner 145 is operative to determine, based on the message 101A, a number of recipients of the message. This feature can be used for training a spam prediction model, as a greater number of recipients can increase the probability of a message being a spam message.
  • the message can be a message forwarded to multiple recipients and may be undesired by the recipients.
  • the message sender determiner 146 is operative to receive the message 101A and determine an identifier of a sender of the message.
  • the identifier of the message sender can be a phone number of the sender, or any other identifier that uniquely identifies the sender of the message and/or the device from which the message has been sent.
  • it may be of interest to identify the WD 102A from which the message was transmitted.
  • it may be of interest to identify the sender of the message independently of the device used, as a sender can use one of multiple WDs to transmit messages.
  • the time indicator determiner 147 is operative to receive the message 101A and determine a time indicator for the message.
  • the time indicator can include a time of day, a day of the week, season, or any other time category or indication that corresponds to the time of receipt of the message or to the time of generation of the message.
  • the time indicator can be a time stamp of the message.
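  • Deriving the time categories mentioned above from a time stamp can be sketched as follows. The bucket boundaries for the time of day and the Northern Hemisphere season mapping are illustrative assumptions, as is the function name.

```python
from datetime import datetime
from typing import Dict

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
SEASONS = ["winter", "spring", "summer", "autumn"]   # Northern Hemisphere assumption


def time_indicator(timestamp: datetime) -> Dict[str, str]:
    """Derive coarse time categories from a message time stamp."""
    hour = timestamp.hour
    if 6 <= hour < 12:
        time_of_day = "morning"
    elif 12 <= hour < 18:
        time_of_day = "afternoon"
    elif 18 <= hour < 22:
        time_of_day = "evening"
    else:
        time_of_day = "night"
    return {
        "time_of_day": time_of_day,
        "day_of_week": DAYS[timestamp.weekday()],         # weekday(): Monday is 0
        # Dec/Jan/Feb -> 0, Mar/Apr/May -> 1, Jun/Jul/Aug -> 2, Sep/Oct/Nov -> 3
        "season": SEASONS[(timestamp.month % 12) // 3],
    }


indicator = time_indicator(datetime(2019, 3, 27, 14, 30))
```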
  • the sender location determiner 148 is operative to receive the message 101A and to determine the location indicator for a sender of the message 101A.
  • the location indicator is the location of the sender.
  • the location indicator can include at least one of the location of the sender and one or more of a location category/type (e.g., bank, mall, company’s headquarters, industrial location, educational location, etc.).
  • the location category/type provides enriched contextual information that can be used as a feature in the spam prediction model 152.
  • the conversation history determiner 149 is operative to receive the message 101A and determine a message conversation history.
  • the conversation history may include a measure of the length of the conversation between the sender of the message and each recipient of the message.
  • the length of a conversation can include a number of times the recipient responded to the sender.
  • the length of the conversation can include a number of messages that were sent by the sender to the recipient which received responses.
  • the length of the conversation can assist in the determination of whether the message and/or the sender are spam.
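  • One possible way to measure the length of a conversation is sketched below. It counts the number of exchanges in which one or more messages from the sender were followed by a reply from the recipient. The event format (a time-ordered list of (from, to) pairs) and the function name are assumptions made for the example.

```python
def conversation_length(events, sender, recipient):
    """Count exchanges in which the recipient replied to the sender.

    `events` is a time-ordered list of (from_id, to_id) pairs.
    Several sender messages before one reply count as a single exchange.
    """
    answered = 0
    awaiting_reply = False
    for src, dst in events:
        if src == sender and dst == recipient:
            awaiting_reply = True
        elif src == recipient and dst == sender and awaiting_reply:
            answered += 1
            awaiting_reply = False
    return answered
```

  • A sender whose messages never receive a reply yields a length of zero, which can be one signal among others that the sender is a spam user.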
  • one or more additional features can be determined for a message and used in a training phase and a classification/prediction phase.
  • another feature can be determined based on the subscription of the sender for the message service.
  • a weight factor can be determined for a sender based on the message subscription plan that the sender has with the message service. The weight factor can be determined based on the usage pattern of the sender when compared with the subscription plan. For example, when a consistent usage pattern is identified along with differing number of recipients, this may be considered as an indication that the sender is a spam user.
  • for example, when a user has subscribed to a 100 SMS/day subscription plan, and behavioral analysis shows that the user has consistently been sending 100 SMS/day to different recipients for a certain period of time, there is a high probability that the messages are marketing spam messages and that the sender is a spam user.
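  • The subscription-based weight factor can be sketched as follows. This is only an illustration under stated assumptions: the weight is taken to be the product of (i) the fraction of days on which the sender used nearly the whole daily quota and (ii) the fraction of recipients over the period that are distinct. The function name, the tolerance parameter, and the product form are hypothetical.

```python
def subscription_weight(daily_counts, daily_recipients, plan_limit, tolerance=0.05):
    """Return a weight in [0, 1]; higher values suggest spam-like plan usage.

    daily_counts:     number of messages sent on each day
    daily_recipients: one set of recipient IDs per day
    plan_limit:       messages per day allowed by the plan (e.g., 100)
    """
    if not daily_counts:
        return 0.0
    # Fraction of days on which (nearly) the whole quota was used.
    threshold = plan_limit * (1 - tolerance)
    saturation = sum(1 for c in daily_counts if c >= threshold) / len(daily_counts)
    # Fraction of per-day recipients that are distinct over the whole period.
    total = sum(len(s) for s in daily_recipients)
    distinct = len(set().union(*daily_recipients))
    novelty = distinct / total if total else 0.0
    return saturation * novelty
```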
  • the message feature extractor 140 further includes the content feature extractor 142.
  • the content feature extractor 142 includes a multimedia feature extractor 143 and a text feature extractor 144.
  • the multimedia feature extractor 143 is operative to receive multimedia messages and extract a set of one or more message content features for the multimedia message.
  • the text feature extractor 144 is operative to receive a text message and extract a set of one or more message content features for the text message.
  • the multimedia feature extractor 143 and the text feature extractor 144 are optional elements. In some embodiments, at least one of the multimedia feature extractor 143 or the text feature extractor 144 is not included in the content feature extractor 142. For example, when the system is operative to offer an SMS service, the multimedia feature extractor may not be included. Other embodiments can be contemplated in which one or the other of the feature extractors is not present. In some embodiments, the two components 143 and 144 may be implemented as a single feature extractor that is adapted to extract features from the message received regardless of the type of message.
  • the multimedia feature extractor 143 may use image feature extraction techniques to extract features from a multimedia message. For example, one or more types of objects present in an image can be extracted. A vectorized representation of the objects is used as a set of content features for the multimedia message.
  • the text feature extractor 144 may use word-embeddings to extract the features.
  • the word embeddings of each sentence in a message can be determined to capture the semantics of the message.
  • pre-trained word vectors can be used to extract features of text messages.
  • features such as category of the message (e.g., sales, banking, clothing, etc.) can be extracted.
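  • A minimal sketch of text feature extraction with pre-trained word vectors is given below. It assumes that a sentence is represented by the mean of the vectors of its known words (mean pooling), which is one common choice and is not stated in the description. The toy vector table and the function name are hypothetical.

```python
def sentence_embedding(sentence, vectors, dim):
    """Average the vectors of the known words of a sentence.

    `vectors` maps a lowercase word to a list of `dim` floats.
    Words without a vector are skipped; an all-zero vector is
    returned when no word of the sentence is known.
    """
    words = [w for w in sentence.lower().split() if w in vectors]
    if not words:
        return [0.0] * dim
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]

# Toy two-dimensional vectors standing in for pre-trained embeddings.
toy_vectors = {"win": [1.0, 0.0], "cash": [0.0, 1.0]}
features = sentence_embedding("Win cash now", toy_vectors, dim=2)
```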
  • the set of features for a message can be used either during a training phase, in which features of multiple messages are used to form a data set for training the spam prediction model, or during a classification phase for spam detection of the message.
  • the features 111 can be used for classification of the message and during a training phase, in which the message 101A is used in combination with other messages for determining an update of the spam detection model.
  • FIG. 3 illustrates a block diagram of exemplary spam feedback repository and message repository, in accordance with some embodiments.
  • the spam feedback repository 170 includes a spam feedback history 171, a sender location determiner 173, a message flag determiner 175, a message counter 177, and a subscription determiner 179.
  • the user location registry 135 is a database that includes permanent subscriber information for the wireless communication network 100.
  • the information includes the location of the user.
  • the user location registry 135 can be a home location register (HLR).
  • the HLR is a central database that contains details of each wireless device user that is authorized to use the network 100.
  • the HLRs store details of every SIM card issued by the mobile phone operator.
  • Each SIM has a unique identifier called an IMSI which is the primary key to each HLR record.
  • the database 135 can include a visitor location register (VLR).
  • the VLR contains information about the users that subscribe to services of the network 100 and which are roaming within a mobile switching center’s (MSC) location area.
  • the spam feedback repository 170 receives spam feedback 109 from one or multiple WDs, e.g., WDs 102B-N.
  • the spam feedback includes an identifier of the message, and a spam variable(s).
  • the identifier of the message can include a unique identifier that identifies the message in the message repository 130. For example, each one of the messages 133 (messages 1-3) stored in the message repository 130 is associated with a respective message identifier 132. The message identifier can be generated at the time the message is first received at the message management center 110 and stored in the message repository 130.
  • Message 1 illustrates an example message as stored in the message repository 130. Message 1 includes a message time stamp (that is, the time at which the message was generated or received), a sender identifier (sender ID), a receiver identifier (receiver ID), and a message content (which can include text and/or multimedia data).
  • the spam feedback 109 can also include a sender identifier, a receiver identifier, a time indicator of the message (e.g., the message timestamp).
  • the additional information such as the sender’s identifier, the receiver’s identifier, the time indicator may not be included in the spam feedback 109 and can be determined from the message repository 130 based on the message identifier.
  • the spam variables include an indication of whether the message identified by the message identifier is spam.
  • the spam variable may include one or more variables.
  • a first variable can be used to determine whether the message includes spam content.
  • a second variable can be used to determine whether the sender of the message is a spam user.
  • the spam variable can be determined based on one or multiple spam flags that were transmitted to the recipient for the message.
  • the spam variables are the spam flags and there is no additional parameter that is stored in the spam feedback history 171.
  • the spam variables are parameters that indicate whether the recipient agreed with the spam flags received or not.
  • the recipient may have received a notification indicating that the message is a spam message or alternatively that the sender of the message is a spam user, and the spam variable includes a confirmation that the received spam flag is accurate or alternatively an indication that the spam flag was erroneous.
  • the spam variable may include one or more spam flags that are set by the recipient without having received the notifications for the message. In other words, in these embodiments, even when no spam flag is transmitted to the recipient (e.g., either because the message was not classified or because the classification did not result in a spam flag being transmitted to the recipient of the message) the recipient can transmit one or more spam flags indicating whether the recipient considers the message as including spam content, the sender as being a spam user, or both.
  • Figure 3 illustrates an example in which the spam variables 176 indicate whether a spam flag received was accurate or not. For example, a spam variable with a value of 0 indicates that the spam flag was accurate, whereas a spam variable with a value of 1 indicates that the spam flag was erroneous.
  • a spam flag can be erroneous if the recipient of the spam flag does not agree with the categorization of the message or the sender as spam. The spam flag is accurate, when the recipient agrees that the sender or the message are spam.
  • the spam feedback history 171 includes, for each message identifier 172, a respective spam flag 174 indicating one or more spam flags that were sent to the recipient for the message identified by the message identifier, and a spam variable 176, which can include one or more values.
  • the spam feedback history 171 may further include a sender’s location 178 and/or receiver subscription information 181.
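  • An entry of the spam feedback history 171 could be represented as sketched below. The field names are hypothetical. The encoding of the spam variables (0 when the flag was accurate, 1 when it was erroneous) follows the example of Figure 3, and the helper shows how a flag and its spam variable might be combined into a label confirmed by the recipient.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class FeedbackRecord:
    message_id: str                # message identifier 172
    user_flag: bool                # spam flag 174: sender flagged as spam user
    message_flag: bool             # spam flag 174: content flagged as spam
    user_flag_erroneous: int       # spam variable 176: 0 = accurate, 1 = erroneous
    message_flag_erroneous: int    # spam variable 176: 0 = accurate, 1 = erroneous
    sender_location: Optional[str] = None        # sender's location 178
    receiver_subscription: Optional[str] = None  # subscription information 181

def confirmed_labels(rec: FeedbackRecord) -> Tuple[bool, bool]:
    """Return (sender_is_spam, content_is_spam) as confirmed by the recipient.

    A flag marked as erroneous is inverted; an accurate flag is kept.
    """
    sender_is_spam = rec.user_flag != bool(rec.user_flag_erroneous)
    content_is_spam = rec.message_flag != bool(rec.message_flag_erroneous)
    return sender_is_spam, content_is_spam
```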
  • the spam feedback repository 170 can include a sender location determiner 173.
  • the sender location determiner 173 is operative to access the user location registry 135 and obtain a location of the sender of the message. In some embodiments, the sender location determiner 173 is not present.
  • the spam feedback repository 170 may also include a message flag determiner 175.
  • the message flag determiner 175 enables other components to access the spam feedback repository 170 and determine the spam flags and/or the spam variables of one or more messages.
  • the message counter 177 enables the determination of a number of messages for a given sender, receiver in a given interval of time.
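  • The message counter can be illustrated with the short sketch below, which counts the messages of a given sender and receiver within a half-open time interval. The dictionary keys mirror the fields of the stored message (time stamp, sender ID, receiver ID), but their exact names are assumptions.

```python
def count_messages(messages, sender_id, receiver_id, start, end):
    """Count messages from sender_id to receiver_id with start <= timestamp < end."""
    return sum(
        1
        for m in messages
        if m["sender_id"] == sender_id
        and m["receiver_id"] == receiver_id
        and start <= m["timestamp"] < end
    )
```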
  • the subscription determiner 179 allows the determination of subscription information for the message.
  • the subscription information can be information about the sender’s subscription or the receiver’s subscription.
  • the sender location determiner 173, the message flag determiner 175, the message counter 177, and the subscription determiner 179 are optional elements and one or more of these elements is not present in the spam feedback repository 170 in some embodiments.
  • one or more of these elements can be included in the message feature extractor 140.
  • once the spam feedback 109 is stored in the spam feedback repository 170, the feedback is transmitted to the prediction model generator 160 to be used as a feature in the determination of a spam prediction model 152.
  • the message management center 110 includes a prediction model generator 160.
  • the prediction model generator 160 is operative to perform a learning phase that enables generation of a spam prediction model.
  • the generation of the spam prediction model includes the determination of optimal weights for the spam prediction model.
  • the optimal weights that are generated allow the determination for a given set of features of a message, of whether the content of the message includes spam content and whether the sender of the message is a spam user.
  • Figure 4 illustrates a block diagram of an exemplary prediction model generator, in accordance with some embodiments.
  • the learning phase of a prediction model is performed based on a data set of inputs and expected outputs for the prediction model.
  • a prediction model can be trained based on messages and known spam labels for the messages.
  • the spam labels can indicate whether the message includes spam content, whether the message has a sender that is a spam user, or whether the message both includes spam content and has a sender that is a spam user.
  • in a first phase, two separate prediction model determiners can be used to determine separate user and content prediction models.
  • in a second phase, an aggregate model is generated.
  • Figure 4 includes a user prediction model determiner 161A and a content prediction model determiner 161B.
  • the user prediction model determiner 161A is operative to determine a user prediction model.
  • the user prediction model can be used to determine whether the sender of a message is a spam user or not based on features of the message.
  • the content prediction model determiner 161B is operative to determine a content prediction model.
  • the content prediction model can be used to determine whether the message includes spam content or not based on features of the message.
  • the user prediction model determiner 161A includes a user data set generator 162A and a user prediction model generator 164A.
  • the user data set generator 162A is operative to generate a user data set 163A that includes, for each set of features of a message, a corresponding indication of spam for the sender of the message.
  • the user data set 163A is generated based on spam feedback received from the users to which messages are sent.
  • the user data set 163A can be generated based on external sources, which enable classification of senders.
  • the user data set generator 162A may have access to an external spam user identifier 169 (e.g., database including IDs of spam users).
  • the user data set generator 162A may transmit a user ID to the external spam user identifier 169 and receive an indication of whether a particular message sender is a spam user.
  • the data set can be used in the user prediction model generator 164A to determine one or more user prediction models 165A-1 to 165A-N.
  • Each one of the user prediction models 165A-1 to 165A-N is defined based on a set of user-based weights that are to be applied to message features to perform spam detection.
  • the user prediction model generator 164A is operative to determine a user prediction model per location.
  • the location can be the location of the sender or the location of the receiver. In some embodiments, for example depending on the scale of the location, the location of the sender and the location of the receiver can be different locations. In other embodiments, the location of the sender and the location of the receiver are the same. The user prediction model generated for the location can then be used for spam detection in all messages received from senders at that location.
  • a location can be a city, a neighborhood, a street, or any other scale for performing spam detection of senders at that location based on a same model.
  • the user prediction model generator 164A generates multiple user prediction models, each model for a different location. In other embodiments, the user prediction model generator 164A can generate a user prediction model for a given user. In these embodiments, the data set used for the generation of the user prediction model is built with data related to usage of a single receiver of messages.
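  • The use of one set of weights per location can be sketched as follows. The description only states that a model is defined by weights applied to message features. The logistic form of the score, the fallback to a default model, and all names below are assumptions made for the example.

```python
import math

def spam_probability(weights, features, bias=0.0):
    """Apply a set of weights to message features (logistic score assumed)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def model_for(location, models_by_location, default_weights):
    """Select the weights trained for a location, or a default set."""
    return models_by_location.get(location, default_weights)

# Hypothetical per-location user-based weights for two features.
models = {"city_a": [2.0, -1.0], "city_b": [0.5, 0.5]}
weights = model_for("city_a", models, default_weights=[0.0, 0.0])
p = spam_probability(weights, [1.0, 2.0])  # z = 2.0 - 2.0 = 0.0
```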
  • the content prediction model determiner 161B includes a content data set generator 162B and a content prediction model generator 164B.
  • the content data set generator 162B is operative to generate a content data set 163B that includes, for each set of features of a message, a corresponding indication of spam for the content of the message.
  • the content data set 163B is generated based on spam feedback received from the users to which messages are sent.
  • the content data set 163B can be generated based on external sources, which enable classification of message content.
  • the content data set generator 162B may have access to an external spam content classifier (not illustrated).
  • the content data set generator 162B may transmit a message to the external spam content classifier and receive an indication of whether the message includes spam content.
  • the data set can be used in the content prediction model generator 164B to determine one or more content prediction models 165B-1 to 165B-N.
  • Each one of the content prediction models 165B-1 to 165B-N is defined based on a set of content-based weights that are to be applied to message features to perform spam detection.
  • the content prediction model generator 164B is operative to determine a content prediction model per location.
  • the location can be the location of the sender or the location of the receiver. In some embodiments, for example depending on the scale of the location, the location of the sender and the location of the receiver can be different locations. In other embodiments, the location of the sender and the location of the receiver are the same.
  • the content prediction model generated for the location can then be used for spam detection in all messages received from senders at that location.
  • a location can be a city, a neighborhood, a street, or any other scale for performing spam detection of senders at that location based on a same model.
  • the content prediction model generator 164B generates multiple content prediction models, each model for a different location. In other embodiments, the content prediction model generator 164B can generate a content prediction model for a given user. In these embodiments, the data set used for the generation of the content prediction model is built with data related to usage of a single receiver of messages.
  • an aggregate prediction model is determined at the aggregate prediction model determiner 166.
  • the aggregate weights determiner 167 determines based on the user-based weights and the content-based weights a set of aggregate weights for the features.
  • the aggregate weights are used for the definition of an aggregate prediction model.
  • the aggregate prediction model, e.g., spam prediction model 152, is used to determine, for a message, whether the sender of the message is a spam user and whether the message includes spam content.
  • the output of the spam prediction model can be a first indication that the message includes spam content, a second indication that the sender of the message is a spam user, or both.
  • the use of the aggregate prediction model, i.e., spam prediction model 152, significantly reduces the time taken for classifying incoming messages and thus makes it feasible to add to a real time solution for spam detection in a message management center 110.
  • Several mechanisms can be used to determine the aggregate weights from the user-based weights and the content-based weights. For example, the aggregate weights can be determined based on the following equation:
  • the parameters a and b can be determined based on machine learning mechanisms such as least squares, random forest, neural networks, etc. The determination is done based on the spam feedback received from the receivers, which indicates whether a message was from a spam user or its content was spam. In some embodiments, the operations described with reference to Figure 4 are continuously performed such that the aggregate weights and consequently the aggregate prediction model (i.e., the spam prediction model) are continuously updated based on feedback received from the receivers of the messages.
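  • The equation itself is not reproduced in this text. The sketch below therefore assumes, as one form consistent with the two parameters a and b, that the aggregate weights are a linear combination of the user-based and content-based weights: w_agg = a * w_user + b * w_content. Under that assumption, and with linear scores, a and b can be fitted by ordinary least squares on the scores of the two models against feedback labels. All function names are hypothetical.

```python
def fit_mixing(user_scores, content_scores, labels):
    """Least-squares fit of (a, b) minimizing sum((a*u + b*c - y)**2).

    Solves the 2x2 normal equations with Cramer's rule.
    """
    suu = sum(u * u for u in user_scores)
    scc = sum(c * c for c in content_scores)
    suc = sum(u * c for u, c in zip(user_scores, content_scores))
    suy = sum(u * y for u, y in zip(user_scores, labels))
    scy = sum(c * y for c, y in zip(content_scores, labels))
    det = suu * scc - suc * suc
    if abs(det) < 1e-12:
        raise ValueError("user and content scores are collinear")
    a = (suy * scc - scy * suc) / det
    b = (scy * suu - suy * suc) / det
    return a, b

def aggregate_weights(a, b, w_user, w_content):
    """Assumed form: w_agg = a * w_user + b * w_content, feature by feature."""
    return [a * wu + b * wc for wu, wc in zip(w_user, w_content)]
```

  • Because a * (w_user · x) + b * (w_content · x) equals (a * w_user + b * w_content) · x for a feature vector x, a single pass with the aggregate weights gives the same score as evaluating the two models separately and mixing their scores.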
  • the proposed method uses a combination of both the user-based features (sender/receiver identifications, respective location, etc.) and the contextual features (industrial/educational, sender location type derived from additional data sources, sender subscription type, sender VAS subscription parameters, message content, message length/size, number of conversation exchanges, time of the message, etc.) to build a spam prediction model and perform spam detection of messages in a messaging service of a communication network.
  • Figure 5 illustrates a flow diagram of exemplary operations for message spam detection in accordance with some embodiments.
  • the message management center 110 receives, at operation 502, from a first device of a first sender, a message addressed to one or more receivers.
  • the message management center 110 determines, at operation 504, a set of features for the message.
  • the set of features includes at least one or more content features that relates to content of the message, and one or more user features that relate to the first sender.
  • the message management center 110 determines, at operation 506, based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content.
  • the message management center 110 transmits, at operation 508, a notification, e.g., notifications 103A-K, to one or more wireless devices of the one or more receivers including at least one of a user flag and a message flag.
  • the user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content.
  • the notifications 103A-K include one or more flags that indicate whether a sender of a message 105A-L is a spam user or alternatively that indicate whether the message includes spam content.
  • the notifications, e.g., notifications 103A-K, may include (operation 510) the message itself in addition to the spam flags.
  • the notifications 103A-K may include only the flag(s), or the flag(s) and one or more additional information related to the message or the sender that caused the generation of the spam flag.
  • the additional information can include an identifier of the sender.
  • the message management center 110 transmits, at operation 512, the message to the one or more devices of the one or more receivers.
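  • The operations of Figure 5 can be sketched end to end as follows. The feature extractor, the prediction model, and the transport are passed in as callables because their implementations are not fixed here. The function name, the message and notification layouts, and the choice to send a notification only when at least one flag is set are assumptions.

```python
def handle_message(message, extract_features, model, send):
    """Process one received message (operation 502 has already happened).

    extract_features(message) -> list of features
    model(features)           -> (sender_is_spam, content_is_spam)
    send(receiver, payload)   -> delivers a payload to a receiver's device
    """
    features = extract_features(message)                # operation 504
    sender_is_spam, content_is_spam = model(features)   # operation 506
    notification = {
        "message_id": message["id"],
        "user_flag": sender_is_spam,
        "message_flag": content_is_spam,
    }
    for receiver in message["receivers"]:
        if sender_is_spam or content_is_spam:
            send(receiver, notification)                # operation 508
        send(receiver, message)                         # operation 512
    return notification
```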
  • Figure 6 illustrates a flow diagram of exemplary notifications that can be sent to a user following detection of message spam in accordance with some embodiments.
  • the notifications are determined for a given message based on a spam prediction performed for the message based on content-based features and user-based features.
  • the notifications include, at operation 602, only the user flag when the user flag indicates that the first sender is a spam sender and the message flag does not indicate that the message includes spam content.
  • the notifications include, at operation 604, only the message flag when the message flag indicates that the message includes spam content and the user flag does not indicate that the first sender is a spam user.
  • the notifications include, at operation 606, both the message flag and the user flag when the message flag indicates that the message includes spam content and the user flag indicates that the first sender is a spam user.
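  • The three cases of Figure 6 reduce to the small selection function sketched below. The flag names are hypothetical labels for the user flag and the message flag.

```python
def notification_flags(sender_is_spam, content_is_spam):
    """Select the flags carried by a notification (operations 602, 604, 606)."""
    flags = []
    if sender_is_spam:
        flags.append("user_flag")      # alone: operation 602
    if content_is_spam:
        flags.append("message_flag")   # alone: operation 604
    return flags                       # both present: operation 606
```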
  • the notifications may further include a single spam flag.
  • the spam flag may result from the determination that the message includes spam content or that the sender of a message is a spam user.
  • the spam flag may not indicate which of spam content or a spam user caused the generation of the spam flag.
  • the receiver of the notification upon receipt of the spam flag may request to obtain more information.
  • the WD receiving the notification e.g., WD 102B, may transmit a request to the message management center 110 for additional information regarding the message that resulted in the transmission of the notification.
  • the request may include the message identifier.
  • the message management center 110 is operative to retrieve the more detailed spam flags, a first flag indicating that the sender is a spam user and/or a second flag indicating that the message includes spam content. The message management center 110 may then transmit to the WD a second notification including the more detailed spam flags.
  • FIG. 7 illustrates a flow diagram of exemplary operations for receiving spam feedback from a device, e.g., WD 102A, in accordance with some embodiments.
  • the WD Upon receipt of notifications from the message management center 110, the WD is operative to transmit feedback to the message management center 110.
  • a messaging application included in the WD includes a user interface with graphical elements enabling a user of the WD to select one or more graphical elements for transmitting feedback related to the spam notifications received.
  • a graphical element can indicate that the user agrees with the spam flag received.
  • another graphical element can indicate that the user does not agree with the spam flags received.
  • the messaging application may further include enhanced functionality in the messaging transmission protocol for transmitting the spam feedback to the messaging management center 110.
  • the WD 102B may report spam feedback using an enhancement to SMS-COMMAND or a new SMS-REPORT.
  • the operations of Figure 7 may be performed.
  • the message management center 110, e.g., message gateway 120, receives, at operation 702, from a second device of the wireless devices, e.g., WD 102B, spam feedback that indicates whether a second user of the second wireless device agrees with the user flag and the message flag.
  • the spam feedback can include spam variables.
  • the spam feedback includes, at operation 712, a first indication that the second user agrees that the sender is a spam user.
  • the spam feedback includes, at operation 714, a second indication that the second user agrees that the message includes spam content.
  • the spam feedback includes, at operation 716, a first indication that the second user agrees that the sender is a spam user and a second indication that the second user agrees that the message includes spam content.
  • the message management center 110 uses the spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content.
  • the spam feedback can be used to generate new data sets and the new data sets are used during a new training phase.
  • the result of the new training phase can be an updated prediction model that is adapted to feedback received from the user. This may result in messages determined as spam messages according to a first prediction model to no longer be determined as spam according to the updated model or vice versa.
  • similarly, senders determined as spam users according to the first prediction model may no longer be determined to be spam users according to the updated prediction model, or alternatively, senders that were not determined to be spam users according to the first prediction model may be determined to be spam users according to the updated prediction model.
  • the embodiments presented herein allow for more flexibility in spam detection, in that a user may adapt their behavior depending on the messages that they want to receive.
  • the system automatically determines new prediction models based on the feedback received.
  • the message management center 110 may transmit the messages that caused the transmission of the user spam notification to the receiver.
  • the message management center 110 may determine based on a message identifier included in the spam feedback, the message that caused the transmission of the spam flag to the WD, retrieve the message from the message repository 130 and transmit the message to the WD.
  • the receiver may further include a request to retrieve the message that caused the transmission of the user spam notification to the receiver. The request can be a PULL/RETRIEVE command identifying the message to be retrieved or identifying the sender of the message, or both.
  • Figure 7 illustrates exemplary operations performed by the message management center 110 upon receipt of spam feedback from a WD that has received one or multiple spam flags from the message management center 110.
  • Figure 8 illustrates a flow diagram of exemplary operations for receiving spam feedback from another device, when this other device has not received any spam flag for a given message. The operations of Figure 8 may be performed following the transmission of a message to a device.
  • Upon receipt of the message from the message management center 110, the WD transmits feedback to the message management center 110.
  • a messaging application included in the WD includes a user interface with graphical elements enabling a user of the WD to select one or more graphical elements for transmitting feedback related to the spam notifications received.
  • a graphical element can indicate that the message is spam.
  • another graphical element can indicate that the sender of the message is a spam user.
  • another graphical element can indicate that the content of the message is spam.
  • Different types of graphical elements and graphical user interfaces can be used for enabling the user of the WD to transmit feedback to the message management center 110.
  • the messaging application may further include enhanced functionality in the messaging transmission protocol for transmitting the spam feedback to the messaging management center 110.
  • the WD 102B may report spam feedback using an enhancement to SMS-COMMAND or a new SMS-REPORT.
  • the message management center 110 receives from a third device, second spam feedback that indicates at least one of whether the sender of a message received by the device is a spam user and whether the message received at the third device includes spam content.
  • the spam feedback includes, at operation 812, a third indication that the second sender is a spam user.
  • the spam feedback includes, at operation 814, a fourth indication that the second message includes spam content.
  • the spam feedback includes, at operation 816, the third indication that the second sender is a spam user and the fourth indication that the second message includes spam content.
  • the message management center 110 uses the spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content.
  • the spam feedback can be used to generate new data sets and the new data sets are used during a new training phase.
  • the result of the new training phase can be an updated prediction model that is adapted to feedback received from the user. This may result in messages determined as spam messages according to a first prediction model to no longer be determined as spam according to the updated model or vice versa.
  • similarly, senders determined as spam users according to the first prediction model may no longer be determined to be spam users according to the updated prediction model, or alternatively, senders that were not determined to be spam users according to the first prediction model may be determined to be spam users according to the updated prediction model.
  • the embodiments presented herein allow for more flexibility in spam detection, in that a user may adapt their behavior depending on the messages that they want to receive.
  • the system automatically determines new prediction models based on the feedback received.
  • the message management center 110 may transmit the messages that caused the transmission of the user spam notification to the receiver.
  • the message management center 110 may determine based on a message identifier included in the spam feedback, the message that caused the transmission of the spam flag to the WD, retrieve the message from the message repository 130 and transmit the message to the WD.
  • An electronic device stores and transmits (internally and/or with other electronic devices over a network) code (which is composed of software instructions and which is sometimes referred to as computer program code or a computer program) and/or data using machine-readable media (also called computer-readable media), such as machine-readable storage media (e.g., magnetic disks, optical disks, solid state drives, read only memory (ROM), flash memory devices, phase change memory) and machine-readable transmission media (also called a carrier) (e.g., electrical, optical, radio, acoustical or other form of propagated signals - such as carrier waves, infrared signals).
  • an electronic device e.g., a computer
  • hardware and software such as a set of one or more processors (e.g., wherein a processor is a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application specific integrated circuit, field programmable gate array, other electronic circuitry, a combination of one or more of the preceding) coupled to one or more machine-readable storage media to store code for execution on the set of processors and/or to store data.
  • an electronic device may include non-volatile memory containing the code since the non-volatile memory can persist code/data even when the electronic device is turned off (when power is removed), and while the electronic device is turned on that part of the code that is to be executed by the processor(s) of that electronic device is typically copied from the slower non-volatile memory into volatile memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM)) of that electronic device.
  • Typical electronic devices also include a set or one or more physical network interface(s) (NI(s)) to establish network connections (to transmit and/or receive code and/or data using propagating signals) with other electronic devices.
  • a physical NI may comprise radio circuitry capable of receiving data from other electronic devices over a wireless connection and/or sending data out to other devices via a wireless connection.
  • This radio circuitry may include transmitter(s), receiver(s), and/or transceiver(s) suitable for radiofrequency communication.
  • the radio circuitry may convert digital data into a radio signal having the appropriate parameters (e.g., frequency, timing, channel, bandwidth, etc.).
  • the radio signal may then be transmitted via antennas to the appropriate recipient(s).
  • the set of physical NI(s) may comprise network interface controller(s) (NICs), also known as a network interface card, network adapter, or local area network (LAN) adapter.
  • the NIC(s) may facilitate in connecting the electronic device to other electronic devices allowing them to communicate via wire through plugging in a cable to a physical port connected to a NIC.
  • One or more parts of an embodiment of the inventive concept may be implemented using different combinations of software, firmware, and/or hardware.
  • wireless device refers to a device capable, configured, arranged and/or operable to communicate wirelessly with network nodes and/or other wireless devices. Communicating wirelessly may involve transmitting and/or receiving wireless signals using electromagnetic waves, radio waves, infrared waves, and/or other types of signals suitable for conveying information through air.
  • a WD may be configured to transmit and/or receive information without direct human interaction. For instance, a WD may be designed to transmit information to a network on a predetermined schedule, when triggered by an internal or external event, or in response to requests from the network.
  • Examples of a WD include, but are not limited to, a smart phone, a mobile phone, a cell phone, a voice over IP (VoIP) phone, a wireless local loop phone, a desktop computer, a personal digital assistant (PDA), a wireless camera, a gaming console or device, a music storage device, a playback appliance, a wearable terminal device, a wireless endpoint, a mobile station, a tablet, a laptop, a laptop-embedded equipment (LEE), a laptop-mounted equipment (LME), a smart device, a wireless customer-premise equipment (CPE), a vehicle-mounted wireless terminal device, etc.
  • a WD may support device-to-device (D2D) communication, for example by implementing a 3GPP standard for sidelink communication, vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), vehicle-to-everything (V2X) and may in this case be referred to as a D2D communication device.
  • a WD may represent a machine or other device that performs monitoring and/or measurements, and transmits the results of such monitoring and/or measurements to another WD and/or a network node.
  • the WD may in this case be a machine-to-machine (M2M) device, which may in a 3GPP context be referred to as an MTC device.
  • the WD may be a UE implementing the 3GPP narrow band internet of things (NB-IoT) standard.
  • machines or devices are sensors, metering devices such as power meters, industrial machinery, or home or personal appliances (e.g., refrigerators, televisions, etc.) and personal wearables (e.g., watches, fitness trackers, etc.).
  • a WD may represent a vehicle or other equipment that is capable of monitoring and/or reporting on its operational status or other functions associated with its operation.
  • a WD as described above may represent the endpoint of a wireless connection, in which case the device may be referred to as a wireless terminal.
  • a WD as described above may be mobile, in which case it may also be referred to as a mobile device or a mobile terminal.
  • each one of the WD 102A-N can be a wireless device.
  • each one of the WD 102A-N can be a network device that is communicatively coupled with the message management center via a wired connection as opposed to a wireless connection.
  • the WD 102A-N may be coupled with the message management center 110 through one or more additional network devices that are not illustrated.
  • a network device is an electronic device that communicatively interconnects other electronic devices on the network (e.g., other network devices, end-user devices).
  • Some network devices are “multiple services network devices” that provide support for multiple networking functions (e.g., routing, bridging, switching, Layer 2 aggregation, session border control, Quality of Service, and/or subscriber management), and/or provide support for multiple application services (e.g., data, voice, and video, etc.).
  • each one of the message gateway 120, the message repository 130, the message feature extractor 140, the message spam manager 150, and the prediction model generator 160, and the spam feedback repository 170 can be implemented on one or more network devices.
  • At least two of the message gateway 120, the message repository 130, the message feature extractor 140, the message spam manager 150, and the prediction model generator 160, and the spam feedback repository 170 can be implemented on the same network device.
  • each one of the message gateway 120, the message repository 130, the message feature extractor 140, the message spam manager 150, and the prediction model generator 160, and the spam feedback repository 170 is implemented on a separate one of multiple network devices.
  • FIG. 9 illustrates a block diagram for a network device that can be used for implementing one or more components of the message management center described herein, in accordance with some embodiments.
  • the network device 930 may be a web or cloud server, or a cluster of servers, running on server hardware.
  • the network device is a server device which includes hardware 905.
  • Hardware 905 includes one or more processors 914, network communication interfaces 960 coupled with a computer readable storage medium 912.
  • the computer readable storage medium 912 may include the message feature extractor code 932, the message spam manager code 933, and the prediction model generator code 936, and the spam feedback repository 934.
  • network device 930 may include additional components beyond those shown in Figure 9 that may be responsible for providing certain aspects of the message management center functionality, including any of the functionality described herein and/or any functionality necessary to support the subject matter described herein.
  • network device 930 may include user interface equipment to allow input of information into network device 930 and to allow output of information from network device 930. This may allow a user to perform diagnostic, maintenance, repair, and other administrative functions for network device 930.
  • While one embodiment does not implement virtualization, alternative embodiments may use different forms of virtualization - represented by a virtualization layer 920.
  • the instance 940 and the hardware that executes it form a virtual server which is a software instance of the modules stored on the computer readable storage medium 912.
  • the message feature extractor code 932, the message spam manager code 933, and the prediction model generator code 936 includes instructions which when executed by the hardware 905 causes the instance 940 to implement message feature extractor 952, a message spam manager 953, and prediction model generator 956 that is operative to perform the operations performed by the components of the message management center 110 described with reference to Figures 1-8.
  • FIG. 10 illustrates a block diagram for a device that can be used for implementing one or more devices described herein, in accordance with some embodiments.
  • the device 1030 is an electronic device that is adapted to communicate with a message management center and transmit and receive messages through a messaging service.
  • the device 1030 includes hardware 1005.
  • Hardware 1005 includes one or more processors 1014, network communication interfaces 1060 coupled with a computer readable storage medium 1012.
  • the computer readable storage medium 1012 may include a messaging application code 1035 that includes spam feedback user interface code 1032, and spam feedback transmission protocol code 1033.
  • device 1030 may include additional components beyond those shown in Figure 10 that may be responsible for providing certain aspects of the message management center functionality, including any of the functionality described herein and/or any functionality necessary to support the subject matter described herein.
  • device 1030 may include user interface equipment to allow input of information into device 1030 and to allow output of information from device 1030.
  • While one embodiment does not implement virtualization, alternative embodiments may use different forms of virtualization - represented by a virtualization layer 1020.
  • the instance 1040 and the hardware that executes it form a virtual server which is a software instance of the modules stored on the computer readable storage medium 1012.
  • the messaging application code 1035 that includes spam feedback user interface code 1032, and spam feedback transmission protocol code 1033 includes instructions which when executed by the hardware 1005 causes the instance 1040 to implement the messaging application 1055 that includes spam feedback user interface 1052, and spam feedback transmission protocol 1053 that is operative to perform the operations performed by the wireless devices described with reference to Figures 1-8.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A method and message management center (MMC) for spam detection are described. The MMC receives a message addressed to one or more receivers. The MMC determines a set of features for the message including content features and user features. The content features relate to the content of the message, and the user features relate to the sender of the message. The MMC determines, based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content. Responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, the MMC transmits a notification to one or more wireless devices including at least one of a user flag and a message flag. The user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content.

Description

METHOD AND SYSTEM FOR MESSAGE SPAM DETECTION IN
COMMUNICATION NETWORKS
TECHNICAL FIELD
[0001] The present disclosure relates to the field of network communication; and more specifically, to message spam detection in communication networks.
BACKGROUND
[0002] SMS (short message service) is a text messaging service supported by most wireless communication systems. SMS uses communication protocols to enable wireless devices to send and receive short text messages. Multimedia Messaging Service (MMS) is another mechanism that can be used to send messages to and from a wireless phone over a wireless communication network. The MMS is a standard protocol that extends the core SMS capability, allowing the exchange of text messages greater in length. Unlike text-only SMS, MMS can deliver a variety of media, including videos, images, or audio.
[0003] A spam message (e.g., SMS spam or MMS spam) is an unwanted or an unsolicited message (such as SMS messages or MMS messages) that is sent indiscriminately to a device. Often, a spam message is sent for marketing purposes. A spam message can take the form of a simple message, a link to a number to call or text, a multimedia content, a link to a website for more information or a link to a website to download an application.
[0004] Spam messages are a problem for users of devices, as they bombard the users with irrelevant or uncalled-for messages, draining time and resources. Some mechanisms exist today for detection of spam messages. Some existing spam detection mechanisms use the content of the message to detect a spam message. These approaches are only interested in classifying the content of the message as being spam or not, without any consideration of or focus on the sender, which may also be a spam sender. Some other approaches of spam detection, such as Do Not Disturb (DND) set by operators, either block all the messages from a sender or unblock the messages based on the user interests and registrations. In another spam detection approach, the location of the sender of the message can also be considered in addition to the content of the message sent to determine whether the message is a spam message or not. However, this spam detection approach, similarly to the other ones, is only interested in classifying the message as spam or not rather than classifying the user.
SUMMARY
[0005] One general aspect includes a method performed by a network node for spam detection in a communication network, the method including: receiving, from a first device of a first sender, a message addressed to one or more receivers; determining a set of features for the message, where the set of features includes at least one or more content features that relates to content of the message, and one or more user features that relate to the first sender; determining, based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content; responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, transmitting a notification to one or more devices of the one or more receivers including the at least one of a user flag and a message flag, where the user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content; and responsive to determining that the first sender is not a spam sender and the message does not include spam content, transmitting the message to the one or more devices of the one or more receivers.
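By way of illustration only, the decision flow of the method above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the `Message` and `Notification` types, the `handle_message` function, and the rule-based `toy_predict` stand-in for the spam prediction model are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Message:
    sender: str
    receivers: List[str]
    content: str


@dataclass
class Notification:
    user_flag: bool     # True: the sender is predicted to be a spam user
    message_flag: bool  # True: the content is predicted to be spam


# Hypothetical model interface: returns (sender_is_spam, content_is_spam).
Predictor = Callable[[Message], Tuple[bool, bool]]


def handle_message(msg: Message, predict: Predictor):
    """Return ("notify", Notification) or ("deliver", Message)."""
    sender_is_spam, content_is_spam = predict(msg)
    if sender_is_spam or content_is_spam:
        # At least one of the two determinations is positive: notify.
        return "notify", Notification(sender_is_spam, content_is_spam)
    # Neither the sender nor the content is spam: deliver the message.
    return "deliver", msg


def toy_predict(msg: Message) -> Tuple[bool, bool]:
    # Placeholder rules for illustration; not the disclosed prediction model.
    blocked_senders = {"+10000000000"}
    return msg.sender in blocked_senders, "free prize" in msg.content.lower()
```

In this sketch the two determinations are kept separate, so that a normal sender who sends spam content yields a notification with only the message flag set.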
[0006] One general aspect includes a network node for spam detection in a communication network, the network node including: one or more processors; and non-transitory computer readable storage media that stores instructions, which when executed by the one or more processors cause the network node to: receive, from a first device of a first sender, a message addressed to one or more receivers; determine a set of features for the message, where the set of features includes at least one or more content features that relates to content of the message, and one or more user features that relate to the first sender; determine, based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content. The network node is further to, responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, transmit a notification to one or more devices of the one or more receivers including the at least one of a user flag and a message flag, where the user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content; responsive to determining that the first sender is not a spam sender and the message does not include spam content, transmit the message to the one or more devices of the one or more receivers.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] The inventive concept may best be understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the inventive concept. In the drawings:
[0008] Figure 1 illustrates a block diagram of an exemplary system for enabling message spam detection in a communication network, in accordance with some embodiments.
[0009] Figure 2 illustrates a block diagram of a message feature extractor, in accordance with some embodiments.
[0010] Figure 3 illustrates a block diagram of exemplary spam feedback repository and message repository, in accordance with some embodiments.
[0011] Figure 4 illustrates a block diagram of an exemplary prediction model generator, in accordance with some embodiments.
[0012] Figure 5 illustrates a flow diagram of exemplary operations for message spam detection in accordance with some embodiments.
[0013] Figure 6 illustrates a flow diagram of exemplary notification that can be sent to a user following detection of message spam in accordance with some embodiments.
[0014] Figure 7 illustrates a flow diagram of exemplary operations for receiving spam feedback from a device, in accordance with some embodiments.
[0015] Figure 8 illustrates a flow diagram of exemplary operations for receiving spam feedback from another device, in accordance with some embodiments.
[0016] Figure 9 illustrates a block diagram for a network device that can be used for implementing one or more of the network devices described herein, in accordance with some embodiments.
[0017] Figure 10 illustrates a block diagram for a device that can be used for implementing one or more devices described herein, in accordance with some embodiments.
DETAILED DESCRIPTION
[0018] The following description describes methods and apparatus for message spam detection in communication networks. In the following description, numerous specific details such as logic implementations, opcodes, means to specify operands, resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth in order to provide a more thorough understanding of the present disclosure. It will be appreciated, however, by one skilled in the art that the present disclosure may be practiced without such specific details. In other instances, control structures, gate level circuits and full software instruction sequences have not been shown in detail in order not to obscure the present disclosure. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.
[0019] References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
[0020] Bracketed text and blocks with dashed borders (e.g., large dashes, small dashes, dot-dash, and dots) may be used herein to illustrate optional operations that add additional features to embodiments of the inventive concept. However, such notation should not be taken to mean that these are the only options or optional operations, and/or that blocks with solid borders are not optional in certain embodiments of the inventive concept.
[0021] In the following description and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. “Coupled” is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other. “Connected” is used to indicate the establishment of communication between two or more elements that are coupled with each other.
General Overview:
[0022] Today’s telecommunication operators do not host or run spam detection systems as part of telecom infrastructure. There is no automated spam detection system (e.g., for SMS/MMS messages) that enables the detection of spam messages exchanged in a communication network where the detection is based both on the sender and on the content of the message. Therefore, there is a need for a spam detection system that enhances the experience of a user of a messaging system in a communication network. Further, there is a need for a spam detection system that not only detects spam messages as including spam content but can also detect spam messages as messages originating from a spam sender.
[0023] The embodiments herein present a mechanism for automatically classifying a sender of a message in a communication network as being a spam user or the message received from this sender as being a spam message. This allows the detection of spam users which can send a normal message (e.g., a bank transaction related message), the detection of a normal user who sent a spam message (e.g., a marketing related message), or the detection of a spam user that sent a message including spam content.
[0024] The embodiments herein present several advantages when compared with existing spam detection solutions. The proposed solutions enhance mobile network user experience by providing more nuanced information on the type of spam that can be received by a user. By transmitting a flag identifying whether a sender of a message is a spam user and/or another flag identifying whether the message includes spam content, the user is presented with a more detailed classification of the messages they receive. In some embodiments, the solution presented herein enables the transmission of a notification to a receiver of a message for notifying the receiver that there is a potential of spam. This notification can include the message and one or more additional spam flags for the message that indicate whether one or both of the message content and the sender are spam. Alternatively, the notification may not include the message, e.g., the message is blocked, and the notification only includes the spam flag(s) to inform the user that there is a potential of a user spam or content spam. In addition, the mechanisms described herein enable users that receive messages to report a message and/or a sender of the message as spam.
[0025] The solution presented herein provides a dynamic spam detection platform that is adaptable to feedback received from the end-user. The dynamic spam detection platform can be used in a messaging service platform that can be either an SMS-based service or an MMS-based service. As opposed to static configuration-based filtering solutions present in current spam detection mechanisms, the current solution adapts to the actual user experience, resulting in an improved user experience of the messaging service.
[0026] Figure 1 illustrates a block diagram of an exemplary system for enabling message spam detection in a communication network, in accordance with some embodiments.
[0027] Although the subject matter described herein may be implemented in any appropriate type of communication network using any suitable components, the embodiments disclosed herein are described in relation to a wireless communication network, such as the example wireless communication network 100 illustrated in Figure 1. For simplicity, the network 100 only depicts the following components: a message management center 110, wireless devices 102A-N, and network 107. In practice, a wireless network may further include any additional elements suitable to support communication between wireless devices or between a wireless device and another communication device, such as a landline telephone, a service provider, or any other network node or end device. The wireless network may provide communication and other types of services to one or more wireless devices to facilitate the wireless devices’ access to and/or use of the services provided by, or via, the wireless network.
[0028] The wireless network 100 may comprise and/or interface with any type of communication, telecommunication, data, cellular, and/or radio network or other similar type of system. In some embodiments, the wireless network may be configured to operate according to specific standards or other types of predefined rules or procedures. Thus, particular embodiments of the wireless network may implement communication standards, such as Global System for Mobile Communications (GSM), Universal Mobile Telecommunications System (UMTS), Long Term Evolution (LTE), and/or other suitable 2G, 3G, 4G, or 5G standards; wireless local area network (WLAN) standards, such as the IEEE 802.11 standards; and/or any other appropriate wireless communication standard, such as the Worldwide Interoperability for Microwave Access (WiMax), Bluetooth, Z-Wave and/or ZigBee standards. In different embodiments, the wireless network 100 may comprise any number of wired or wireless networks, base stations, controllers, wireless devices, relay stations, and/or any other components or systems that may facilitate or participate in the communication of data and/or signals whether via wired or wireless connections.
[0029] Each one of the wireless devices 102A-102N is an electronic device that is operative to communicate and connect with the message management center 110 through a combination of wireless and wired network technologies. Each one of the wireless devices 102A-102N, also referred to herein as WD 102A-N, is capable of transmitting one or multiple messages, e.g., messages 101A-M, through the network 107 to the message management center 110. The WD 102A-N are also operative to receive messages 105A-L or notifications 103A-K from the message management center 110. The WD 102A-N are also operative to transmit spam feedback 109 to the message management center 110. While Figure 1 illustrates a set of WDs 102A-N, any number of WDs can transmit and/or receive messages from the message management center. Each one of the WDs can be associated with a user of the device. A user can be a person who handles the device and has a subscription to a wireless service in the network 100. Alternatively, the user of the WD can be a corporation, a bot, or any other entity that is operative to handle/control a wireless device and generate and transmit messages from the wireless device. In the following description, a receiver or recipient can be interchangeably used to refer to a user of the WD when the wireless device receives messages. In the following description, a sender can be used to refer to the entity that uses a WD for transmitting a message to one or more receivers. The message management center 110 includes a message gateway 120, a message repository 130, a message feature extractor 140, a message spam manager 150, a prediction model generator 160, and a spam feedback repository 170.
[0030] The message gateway 120 is a network node that is operative to receive messages from WDs 102A-N. The message gateway 120 is further operative to transmit messages 105A-L and notifications 103A-K to one or more of the WDs 102A-N. The message gateway 120 is also operative to receive spam feedback 109 from the WDs 102A-N. The messages can be any of short text messages transmitted via a short message service (SMS), or multimedia messages transmitted via a multimedia message service (MMS). In some embodiments, the MMS extends the core SMS capability, and allows the exchange of text messages greater in length than what is allowed in SMS. Unlike text-only SMS, MMS can deliver a variety of media, including video, images, or audio. The notifications 103A-K include one or more flags that indicate whether a sender of a message 105A-L is a spam user or alternatively that indicate whether the message includes spam content. In some embodiments, the notifications 103A-K may include the message itself. In other embodiments, the notifications 103A-K may include only the flag, or the flag and additional information related to the message or the sender that caused the generation of the spam flag. For example, the additional information can include an identifier of the sender. The spam feedback 109 is feedback provided by the receiver of a message indicating whether the message received included spam content or originated from a spam user.
[0031] The message management center 110 includes a message repository 130 and a spam feedback repository 170. The message repository 130 is adapted to store the messages 101A-M received from the WD 102A-N. Each one of the message gateway 120 and the message feature extractor 140 can retrieve one or more messages from the message repository 130 when needed. The spam feedback repository 170 is adapted to store the spam feedback received from the WDs 102A-N. The spam feedback repository 170 is adapted to transmit the spam feedback to the prediction model generator 160. The message management center 110 can be used in a wireless network to provide a messaging service to subscribers/users. The message management center 110 is operative to perform spam detection in the messaging service.
[0032] The message feature extractor 140 receives one or more messages from the message repository 130 and determines a set of features 111 for the message. The set of features 111 can include a set of content-based features and a set of user-based features. The content-based features include features that are determined based on the content of the message received. The user-based features include features determined based on the sender of the message.
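As a non-limiting illustration, the split of the set of features 111 into content-based and user-based features can be sketched as follows. The specific features (message length, URL count, digit count, upper-case ratio, number of messages sent, number of distinct receivers), the function names, and the history record schema are assumptions made for this example and are not taken from the disclosure.

```python
import re
from typing import Dict, List

URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def content_features(text: str) -> Dict[str, float]:
    """Features computed only from the content of the message."""
    letters = [c for c in text if c.isalpha()]
    upper = sum(1 for c in letters if c.isupper())
    return {
        "length": float(len(text)),
        "num_urls": float(len(URL_RE.findall(text))),
        "num_digits": float(sum(c.isdigit() for c in text)),
        "upper_ratio": upper / len(letters) if letters else 0.0,
    }


def user_features(sender: str, history: List[dict]) -> Dict[str, float]:
    """Features computed from the sender's past activity.

    history is a list of {"sender": str, "receiver": str} records
    (an assumed schema for stored messages).
    """
    sent = [h for h in history if h["sender"] == sender]
    receivers = {h["receiver"] for h in sent}
    return {
        "messages_sent": float(len(sent)),
        "distinct_receivers": float(len(receivers)),
    }


def extract_features(text: str, sender: str, history: List[dict]) -> Dict[str, float]:
    """Merge both groups into one feature set, prefixed by group name."""
    feats = {f"content.{k}": v for k, v in content_features(text).items()}
    feats.update({f"user.{k}": v for k, v in user_features(sender, history).items()})
    return feats
```

Prefixing the keys keeps the two groups distinguishable in a single feature set, which suits a model that makes separate determinations about the sender and about the content.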
[0033] The prediction model generator 160 is a network node that is operative to receive the set of features 111 determined for multiple messages and determine a spam prediction model 113. In some embodiments, the spam prediction model can be determined based on message features as well as based on spam feedback stored in the spam feedback repository 170. The spam prediction model 152 is fed to the message spam manager 150 to be used within the message spam detector 151. Upon determination that the sender of the message is a spam user, the message spam detector 151 outputs a user flag indicating that the sender of the message is a spam user. Upon determination that the message includes spam content, the message spam detector 151 outputs a message flag indicating that the message includes spam content.
[0034] The message spam manager 150 includes a message spam detector 151 and a message spam flagger 153. The message spam detector 151 includes the spam prediction model 152 and is operative to determine, based on the spam prediction model and the set of features 111 for a message, whether the message includes spam content and whether the sender of the message is a spam user. The message spam flagger 153 is operative to receive at least one of the user flag and the message flag for a message and generate a notification to be sent to the receiver based on these flags. The notification is sent to the message gateway 120 to be transmitted to the receiver, e.g., WD 102B.
[0035] For example, the message spam flagger 153 may add a spam flag entry into the header/footer of a message that is to be sent to the receiver. This is performed through an enhanced messaging protocol that allows for the transmission of additional spam information with the messages transmitted. Alternatively, the spam flags are transmitted as messages, without the transmission of the spam message.
[0036] In some embodiments, the components 140, 150, 160, and 170 are new elements added to a standard message management center 110. The components 140-170 are operative to perform novel operations of spam detection in a message center. While the components 140-170 are shown as part of the message management center 110 as the set 180, in other embodiments, these elements can be separate from the message management center 110 and offered as an add-on to the services of the message management center 110. The set of new components 180 uses a blended classification model for detecting spam in messaging services such as SMS or MMS.
[0037] In operation, the message management center 110 is operative to receive, from a first wireless device, e.g., WD 102A, of a first sender (not illustrated), a message 101A addressed to one or more receivers. For example, the message 101A can be addressed to a user of the WD 102B only, or to the user of the WD 102B and users of one or more additional WDs such as WD 102W. The message management center 110 determines a set of features for the message. The set of features includes content features and user features. The content features relate to the content of the message, and the user features relate to the sender of the message. The message management center 110 determines, based on a spam prediction model, e.g., spam prediction model 152, and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content. Responsive to determining at least one of that the first sender is a spam sender and that the message includes spam content, the message management center 110 transmits a notification, including at least one of a user flag and a message flag, to one or more wireless devices, e.g., WD 102B-N, of the one or more receivers. The user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content. In some embodiments, the message management center 110 may determine based on the spam prediction model 152 that the message does not include spam content and that the sender of the message is not a spam user. In that case, the message management center 110 transmits the message 101A to the one or more wireless devices of the one or more receivers without any modifications.
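The routing decision described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Notification` class, its field names, and the `route_message` function are hypothetical and do not appear in the specification, and the two boolean inputs stand in for the outputs of the spam prediction model 152.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Notification:
    """Hypothetical notification payload; field names are illustrative only."""
    user_flag: bool          # True: the sender is predicted to be a spam user
    message_flag: bool       # True: the message is predicted to contain spam content
    message: Optional[str]   # None when the message itself is withheld


def route_message(message: str, sender_is_spam: bool, content_is_spam: bool,
                  include_message: bool = True) -> Union[str, Notification]:
    """Return the message unchanged when nothing is flagged, else a Notification."""
    if not sender_is_spam and not content_is_spam:
        return message  # forwarded to the receivers without any modification
    return Notification(
        user_flag=sender_is_spam,
        message_flag=content_is_spam,
        message=message if include_message else None,
    )
```

The `include_message` switch mirrors the two notification variants described above: a notification that carries the message together with the flags, or one that carries the flags alone.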
[0038] The message management center 110 is further operative to receive spam feedback 109 from the WDs 102A-N and use the spam feedback in the determination of the spam prediction model 152. The spam feedback can be received as a result of the WD receiving a notification and determining whether the notification is accurate or not. Alternatively, the spam feedback can be received as a result of the receipt of a message and determining whether the message includes spam content, and/or the sender of the message is a spam user. In some embodiments, the message management center 110 is operative to receive from a WD spam feedback that indicates that the sender is not a spam user, when the WD had received a notification indicating that the sender of a message is a spam user. In these embodiments, the message management center 110 may transmit to the receiver the message(s) that caused the transmission of the user spam notification and which were previously blocked as a result of a determination that the sender of the message(s) is a spam user. In these embodiments, upon receipt of the spam feedback, the message management center 110 may determine, based on a message identifier included in the spam feedback, the message that caused the transmission of the spam flag to the WD, retrieve the message from the message repository 130, and transmit the message to the WD.
[0039] The message management center 110 is operative to offer an enhanced message service (e.g., an enhanced SMS service or MMS service) that enables spam detection in exchanged messages. The embodiments herein present several advantages when compared with existing spam detection solutions. The proposed solutions enhance mobile network user experience by providing more nuanced information on the type of spam that can be received by a user. By transmitting a flag identifying whether a sender of a message is a spam user and/or another flag identifying whether the message includes spam content, the user is presented with a more detailed classification of the messages they receive. In some embodiments, the solution presented herein enables the transmission of a notification to a receiver of a message for notifying the receiver that there is a potential of spam. This notification can include the message and one or more additional spam flags for the message that indicate whether one or both of the message content and the user are spam. Alternatively, the notification may not include the message, e.g., the message is blocked, and the notification only includes the spam flag(s) to inform the user that there is a potential of a user spam or content spam. In addition, the mechanisms described herein enable users to report a message and/or a sender of the message as spam.
[0040] In addition, the embodiments herein use a combination of content features and user features to determine whether the receiver is receiving spam. This allows a more robust and better-quality spam detection service to be offered to users of the wireless network 100, as spam may be received from non-spam users and spam users may send non-spam content. The use of content features as well as user features for the detection of spam enables more nuanced spam detection and notifications.
Message Features Extraction
[0041] Upon receipt of a message 101A in the message gateway 120 of the message management center 110 from a sender 102A, the message 101A is stored in the message repository 130. For example, the message can be destined to one or multiple ones of the WDs 102B-N. Prior to being transmitted to the receivers 102B-N, the message is analyzed to determine whether the message includes spam content, or the user is a spam user. The spam detection is performed at least in part based on a set of features 111 extracted by the message feature extractor 140 from the message 101A. Figure 2 illustrates a block diagram of a message feature extractor, in accordance with some embodiments. The message feature extractor 140 is operative to extract features of a message. The features of the message can include content-based features related to the content of the message and user-based features related to the sender of the message. In some embodiments, the features may further include features that relate to the receiver of the message, and one or more additional features that relate to a history of conversation between the sender and the receiver of the message.
[0042] A set of features of a message includes values derived from the message and related information that form an initial set of measured data. These features are intended to be informative and non-redundant values that represent the initial set of measured data. For example, in the current context the initial set of data includes the content of the message (e.g., text, video, audio, and/or image), information on the sender of the message (e.g., location of the sender), information on the receiver of the message, and historical data related to the message and/or the sender/receiver. Each set of features represents a message.
[0043] The features extracted from a message can be used during a learning phase, in which the message management center 110 is operative to learn based on one or more data sets and determine a spam prediction model to be used for spam detection. The data sets are determined based on features of several messages when combined with spam feedback and/or result of external classification/prediction models. The features of a message can further be used during a prediction phase. During the prediction phase spam flags are determined based on the spam prediction model 152 for the message 101A and the features. The features 111 of the message 101A are input into a spam prediction model 152 and the spam prediction model 152 outputs a set of spam flags for the message.
[0044] The message feature extractor 140 includes a content feature extractor 142, a number of message recipients determiner 145, a message sender determiner 146, a time indicator determiner 147, a sender location determiner 148, and a conversation history determiner 149. In some embodiments, the message feature extractor 140 may include a message type determiner 141. For example, when the message management center 110 is adapted to handle SMS messages only, the message type determiner 141 may not be included and the content feature extractor 142 includes only a text feature extractor 144. In these embodiments, the messages received do not include any multimedia content and the extraction of text feature is sufficient to represent the content of the message. In another example, when the message management center 110 is adapted to handle MMS messages, the message type determiner 141 can be included and the content feature extractor 142 includes a text feature extractor 144 and a multimedia feature extractor 143. In these embodiments, the messages received may include both text and multimedia content and a combination of features can be determined to represent the content of the message.
[0045] A message 101A is received by the message feature extractor 140. In some embodiments, the message 101A can be received as a result of the message feature extractor 140 retrieving one or more messages from the message repository 130. For example, this can occur during a training phase of the spam prediction model 152, in which a data set is to be built based on the features of multiple messages. Alternatively, the message can be received automatically by the message feature extractor 140 without having transmitted a request for a message. For example, this can occur during a classification/prediction phase, in which spam detection of a message is to be performed based on the features of the message.
[0046] The number of message recipients determiner 145 is operative to determine, based on the message 101A, a number of recipients of the message. This feature can be used for training a spam prediction model, as a greater number of recipients can increase the probability of a message being a spam message. For example, the message can be a message forwarded to multiple recipients and may be undesired by the recipients.
[0047] The message sender determiner 146 is operative to receive the message 101A and determine an identifier of a sender of the message. For example, the identifier of the message sender can be a phone number of the sender, or any other identifier that uniquely identifies the sender of the message and/or the device from which the message has been sent. In some embodiments, it may be of interest to identify the WD 102A from which the message was transmitted. In other embodiments, it may be of interest to identify the sender of the message independently of the device that was used, as a sender can use one of multiple WDs to transmit messages. The time indicator determiner 147 is operative to receive the message 101A and determine a time indicator for the message. The time indicator can include a time of day, a day of the week, season, or any other time category or indication that corresponds to the time of receipt of the message or to the time of generation of the message. The time indicator can be a time stamp of the message.
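As an illustration of the time indicator, the sketch below derives a few coarse time categories from a message time stamp. The function name `time_indicator`, the category names, and the hour boundaries are assumptions made for this example; the specification does not prescribe them.

```python
from datetime import datetime

# Index matches datetime.weekday(): Monday is 0, Sunday is 6.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def time_indicator(timestamp: datetime) -> dict:
    """Derive coarse time categories from a message time stamp.

    The hour boundaries below are illustrative assumptions.
    """
    hour = timestamp.hour
    if 6 <= hour < 12:
        part = "morning"
    elif 12 <= hour < 18:
        part = "afternoon"
    elif 18 <= hour < 22:
        part = "evening"
    else:
        part = "night"
    return {
        "time_of_day": part,
        "day_of_week": DAYS[timestamp.weekday()],
        "is_weekend": timestamp.weekday() >= 5,
    }
```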
[0048] The sender location determiner 148 is operative to receive the message 101A and to determine the location indicator for a sender of the message 101A. In some embodiments, the location indicator is the location of the sender. In other embodiments, the location indicator can include at least one of the location of the sender and one or more of a location category/type (e.g., bank, mall, company’s headquarters, industrial location, educational location, etc.). The location category/type provides enriched contextual information that can be used as a feature in the spam prediction model 152.
[0049] The conversation history determiner 149 is operative to receive the message 101A and determine a message conversation history. The conversation history may include a measure of the length of the conversation between the sender of the message and each recipient of the message. For example, the length of a conversation can include a number of times the recipient responded to the sender. The length of the conversation can include a number of messages that were sent by the sender to the recipient which received responses. The length of the conversation can assist in the determination of whether the message and/or the sender are spam.
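The two conversation-length measures mentioned above can be sketched as follows. This is a minimal example assuming a chronological log of `(from_id, to_id)` pairs; the log format and the function name `conversation_length` are hypothetical, and "received a response" is interpreted here as "directly followed by a reply within the same sender/recipient thread", which is only one possible reading.

```python
def conversation_length(log, sender, recipient):
    """Compute two conversation-length measures for a sender/recipient pair.

    log: chronological list of (from_id, to_id) pairs (assumed format).
    Returns (replies, answered), where replies is the number of times the
    recipient responded to the sender, and answered is the number of sender
    messages directly followed by a reply within the pair's thread.
    """
    # Keep only the messages exchanged between these two parties.
    thread = [(f, t) for f, t in log if {f, t} == {sender, recipient}]
    replies = sum(1 for f, _ in thread if f == recipient)
    answered = sum(
        1 for prev, nxt in zip(thread, thread[1:])
        if prev[0] == sender and nxt[0] == recipient
    )
    return replies, answered
```

A sender with many outgoing messages and a reply count of zero across most recipients would, under this reading, contribute a feature value that points toward spam.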
[0050] In some embodiments, one or more additional features can be determined for a message and used in a training phase and a classification/prediction phase. For example, another feature can be determined based on the subscription of the sender for the message service. For example, a weight factor can be determined for a sender based on the message subscription plan that the sender has with the message service. The weight factor can be determined based on the usage pattern of the sender when compared with the subscription plan. For example, when a consistent usage pattern is identified along with a differing set of recipients, this may be considered as an indication that the sender is a spam user. In a non-limiting example, when a user has subscribed to a 100 SMS/day subscription plan and behavioral analysis identifies that the user has been consistently sending 100 SMS/day to different recipients for a certain period of time, there is a high probability that the messages are marketing spam messages and that the sender is a spam user.
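One possible way to turn the usage pattern into a weight factor is sketched below. The function name `subscription_weight`, the per-day input format, and both thresholds are assumptions chosen for illustration; the specification does not define how the weight factor is computed.

```python
def subscription_weight(daily_usage, plan_quota,
                        usage_ratio=0.95, distinct_ratio=0.9):
    """Return a weight in [0, 1] for the sender's usage pattern.

    daily_usage: list of (messages_sent, distinct_recipients) per day.
    plan_quota:  messages per day allowed by the subscription plan.
    The result is the fraction of days on which the sender both (nearly)
    exhausted the quota and addressed mostly distinct recipients.
    Both ratio thresholds are hypothetical tuning parameters.
    """
    if not daily_usage or plan_quota <= 0:
        return 0.0
    suspicious = 0
    for sent, distinct in daily_usage:
        near_quota = sent >= usage_ratio * plan_quota
        mostly_distinct = sent > 0 and distinct >= distinct_ratio * sent
        if near_quota and mostly_distinct:
            suspicious += 1
    return suspicious / len(daily_usage)
```

For the 100 SMS/day example above, a sender who fills the quota with distinct recipients on three days out of four would receive a weight of 0.75.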
[0051] The message feature extractor 140 further includes the content feature extractor 142. The content feature extractor 142 includes a multimedia feature extractor 143 and a text feature extractor 144. The multimedia feature extractor 143 is operative to receive multimedia messages and extract a set of one or more message content features for the multimedia message. The text feature extractor 144 is operative to receive a text message and extract a set of one or more message content features for the text message. Several mechanisms can be used to determine text-related features and/or multimedia-related features without departing from the scope of the present inventive concept. The multimedia feature extractor 143 and the text feature extractor 144 are optional elements. In some embodiments, at least one of the multimedia feature extractor 143 or the text feature extractor 144 is not included in the content feature extractor 142. For example, when the system is operative to offer an SMS service, the multimedia feature extractor may not be included. Other embodiments can be contemplated in which one or the other of the feature extractors is not present. In some embodiments, the two components 143 and 144 may be implemented as a single feature extractor that is adapted to extract features from the message received regardless of the type of message.
[0052] In one embodiment, the multimedia feature extractor 143 may use image feature extraction techniques to extract features from a multimedia message. For example, one or more types of objects present in an image can be extracted. A vectorized representation of the objects is used as a set of content features for the multimedia message.
[0053] In one embodiment, the text feature extractor 144 may use word-embeddings to extract the features. The word embeddings of each sentence in a message can be determined to capture the semantics of the message. For example, pre-trained word vectors can be used to extract features of text messages. In addition, features such as category of the message (e.g., sales, banking, clothing, etc.) can be extracted.
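A minimal sketch of a text feature based on pre-trained word vectors is given below. Averaging the word vectors of a sentence is an assumption made for this example (the specification does not state how the embeddings of a sentence are combined), and the `word_vectors` dictionary stands in for whatever pre-trained vectors are actually used.

```python
def sentence_embedding(sentence, word_vectors, dim):
    """Average the vectors of the known words in a sentence.

    word_vectors: dict mapping a lowercase word to a list of `dim` floats
                  (a stand-in for pre-trained word vectors).
    Unknown words are skipped; a sentence with no known word maps to zeros.
    """
    tokens = sentence.lower().split()
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]
```

With toy two-dimensional vectors `{"free": [1.0, 0.0], "prize": [0.0, 1.0]}`, the sentence "Free prize now" maps to `[0.5, 0.5]`.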
[0054] As mentioned above the set of features for a message can be used either during a training phase, in which features of multiple messages are used to form a data set for training the spam prediction model, or during a classification phase for spam detection of the message. In some embodiments, the features 111 can be used for classification of the message and during a training phase, in which the message 101A is used in combination with other messages for determining an update of the spam detection model.
[0055] Figure 3 illustrates a block diagram of exemplary spam feedback repository and message repository, in accordance with some embodiments. The spam feedback repository 170 includes a spam feedback history 171, a sender location determiner 173, a message flag determiner 175, a message counter 177, and a subscription determiner 179.
[0056] The user location registry 135 is a database that includes permanent subscriber information for the wireless communication network 100. The information includes the location of the user. For example, the user location registry 135 can be a home location register (HLR). The HLR is a central database that contains details of each wireless device user that is authorized to use the network 100. The HLR stores details of every SIM card issued by the mobile phone operator. Each SIM has a unique identifier called an IMSI, which is the primary key to each HLR record. The database 135 can include a visitor location register (VLR). The VLR contains information about the users that subscribe to services of the network 100 and which are roaming within a mobile switching center’s (MSC) location area.
[0057] The spam feedback repository 170 receives spam feedback 109 from one or multiple WDs, e.g., WDs 102B-N. The spam feedback includes an identifier of the message and one or more spam variables. In some embodiments, the identifier of the message can include a unique identifier that identifies the message in the message repository 130. For example, each one of the messages 133 stored in the message repository 130 (messages 1-3) is associated with a respective message identifier 132. The message identifier can be generated at the time the message is first received at the message management center 110 and stored in the message repository 130. Message 1 illustrates an example message as stored in the message repository 130. Message 1 includes a message time stamp (that is, the time at which the message was generated or received), a sender identifier (sender ID), a receiver identifier (receiver ID), and a message content (which can include text and/or multimedia data).
[0058] The spam feedback 109 can also include a sender identifier, a receiver identifier, a time indicator of the message (e.g., the message timestamp). In other embodiments, the additional information, such as the sender’s identifier, the receiver’s identifier, the time indicator may not be included in the spam feedback 109 and can be determined from the message repository 130 based on the message identifier.
[0059] The spam variables include an indication of whether the message identified by the message identifier is spam. The spam variable may include one or more variables. A first variable can be used to determine whether the message includes spam content. A second variable can be used to determine whether the sender of the message is a spam user. In some embodiments, the spam variable can be determined based on one or multiple spam flags that were transmitted to the recipient for the message. In some embodiments, the spam variables are the spam flags and there is no additional parameter that is stored in the spam feedback history 171. In other embodiments, the spam variables are parameters that indicate whether the recipient agreed with the spam flags received or not. For example, the recipient may have received a notification indicating that the message is a spam message or alternatively that the sender of the message is a spam user, and the spam variable includes a confirmation that the received spam flag is accurate or alternatively that the spam flag was erroneous. In other embodiments, the spam variable may include one or more spam flags that are set by the recipient without having received the notifications for the message. In other words, in these embodiments, even when no spam flag is transmitted to the recipient (e.g., either because the message was not classified or because the classification did not result in a spam flag being transmitted to the recipient of the message), the recipient can transmit one or more spam flags indicating whether the recipient considers the message as including spam content, the sender as being a spam user, or both.
[0060] Figure 3 illustrates an example in which the spam variables 176 indicate whether a spam flag received was accurate or not. For example, when a spam variable has a value of 0, this is an indication that the spam flag was accurate; alternatively, when a spam variable has a value of 1, this is an indication that the spam flag was erroneous. A spam flag can be erroneous if the recipient of the spam flag does not agree with the categorization of the message or the sender as spam. The spam flag is accurate when the recipient agrees that the sender or the message is spam. In the illustrated example, the spam feedback history 171 includes, for each message identifier 172, a respective spam flag 174 indicating one or more spam flags that were sent to the recipient for the message identified by the message identifier, and a spam variable 176, which can include one or more values. The spam feedback history 171 may further include a sender’s location 178 and/or receiver subscription information 181.
In some embodiments, the spam feedback repository 170 can include a sender location determiner 173. The sender location determiner 173 is operative to access the user location registry 135 and obtain a location of the sender of the message. In some embodiments, the sender location determiner 173 is not present. The spam feedback repository 170 may also include a message flag determiner 175. The message flag determiner 175 enables other components to access the spam feedback repository 170 and determine the spam flags and/or the spam variables of one or more messages. The message counter 177 enables the determination of a number of messages for a given sender, receiver in a given interval of time. The subscription determiner 179 allows the determination of subscription information for the message. The subscription information can be information about the sender’s subscription or the receiver’s subscription. In some embodiments, the sender location determiner 173, the message flag determiner 175, the message counter 177, and the subscription determiner 179 are optional elements and one or more of these elements is not present in the spam feedback repository 170 in some embodiments. For example, one or more of these elements can be included in the message feature extractor 140.
[0061] Once the spam feedback 109 is stored in the spam feedback repository 170, the feedback is transmitted to the prediction model generator 160 to be used as a feature in the determination of a spam prediction model 152.
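Before feedback of this kind is used for training, it has to be turned into labels. The sketch below shows one way to do so for the Figure 3 example, in which a spam variable of 0 confirms the flags and a value of 1 marks them as erroneous. The `FeedbackRecord` class, its field names, and the rule that an "erroneous" report clears every flag that was sent are assumptions of this example, not details from the specification.

```python
from dataclasses import dataclass


@dataclass
class FeedbackRecord:
    """Hypothetical entry of the spam feedback history (illustrative fields)."""
    message_id: str
    user_flag: bool      # flag sent to the recipient: sender predicted as spam user
    message_flag: bool   # flag sent to the recipient: content predicted as spam
    spam_variable: int   # 0: recipient confirms the flags, 1: flags were erroneous


def corrected_labels(record: FeedbackRecord):
    """Return (sender_is_spam, content_is_spam) after applying the feedback.

    Assumption: an "erroneous" report (spam_variable == 1) clears every flag
    that was sent, since the recipient disagrees with the spam categorization.
    """
    confirmed = record.spam_variable == 0
    return record.user_flag and confirmed, record.message_flag and confirmed
```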
[0062] The message management center 110 includes a prediction model generator 160. The prediction model generator 160 is operative to perform a learning phase that enables generation of a spam prediction model. In some embodiments, the generation of the spam prediction model includes the determination of optimal weights for the spam prediction model. The optimal weights that are generated allow the determination for a given set of features of a message, of whether the content of the message includes spam content and whether the sender of the message is a spam user. Figure 4 illustrates a block diagram of an exemplary prediction model generator, in accordance with some embodiments.
[0063] The learning phase of a prediction model is performed based on a data set of inputs and expected outputs for the prediction model. For example, a prediction model can be trained based on messages and known spam labels for the messages. The spam labels can indicate whether the message includes spam content, whether the message has a sender that is a spam user, or whether the message both includes spam content and has a sender that is a spam user.
[0064] In some embodiments, there is no existing data set that can be used for the generation of the spam prediction model 152. In these embodiments, in a first phase, two separate prediction model determiners can be used to determine separate user and content prediction models. In a second phase, based on the models generated for each one of the users and the message content, an aggregate model is generated.
[0065] Figure 4 includes a user prediction model determiner 161A and a content prediction model determiner 161B. The user prediction model determiner 161A is operative to determine a user prediction model. The user prediction model can be used to determine whether the sender of a message is a spam user or not based on features of the message. The content prediction model determiner 161B is operative to determine a content prediction model. The content prediction model can be used to determine whether the message includes spam content or not based on features of the message.
[0066] The user prediction model determiner 161A includes a user data set generator 162A and a user prediction model generator 164A. The user data set generator 162A is operative to generate a user data set 163A that includes, for each set of features of a message, a corresponding indication of spam for the sender of the message. In some embodiments, the user data set 163A is generated based on spam feedback received from the users to which messages are sent. In other embodiments, the user data set 163A can be generated based on external sources, which enable classification of senders. For example, the user data set generator 162A may have access to an external spam user identifier 169 (e.g., a database including IDs of spam users). The user data set generator 162A may transmit a user ID to the external spam user identifier 169 and receive an indication of whether a particular message sender is a spam user or not.
[0067] Upon determination of the user data set 163A, the data set can be used in the user prediction model generator 164A to determine one or more user prediction models 165A-1 to 165A-N. Each one of the user prediction models 165A-1 to 165A-N is defined based on a set of user-based weights that are to be applied to message features to perform spam detection.
[0068] In some embodiments, the user prediction model generator 164A is operative to determine a user prediction model per location. The location can be the location of the sender or the location of the receiver. In some embodiments, for example depending on the scale of the location, the location of the sender and the location of the receiver can be different locations. In other embodiments, the location of the sender and the location of the receiver are the same. The user prediction model generated for the location can then be used for spam detection in all messages received from senders at that location. A location can be a city, a neighborhood, a street, or any other scale for performing spam detection of senders at that location based on a same model. In some embodiments, the user prediction model generator 164A generates multiple user prediction models, each model for a different location. In other embodiments, the user prediction model generator 164A can generate a user prediction model for a given user. In these embodiments, the data set used for the generation of the user prediction model is built with data related to usage of a single receiver of messages.
[0069] The content prediction model determiner 161B includes a content data set generator 162B and a content prediction model generator 164B. The content data set generator 162B is operative to generate a content data set 163B that includes, for each set of features of a message, a corresponding indication of spam for the content of the message. In some embodiments, the content data set 163B is generated based on spam feedback received from the users to which messages are sent. In other embodiments, the content data set 163B can be generated based on external sources, which enable classification of message content. For example, the content data set generator 162B may have access to an external spam content classifier (not illustrated). The content data set generator 162B may transmit a message to the external content classifier and receive an indication of whether a particular message includes spam content or not.
[0070] Upon determination of the content data set 163B, the data set can be used in the content prediction model generator 164B to determine one or more content prediction models 165B-1 to 165B-N. Each one of the content prediction models 165B-1 to 165B-N is defined based on a set of content-based weights that are to be applied to message features to perform spam detection.
[0071] In some embodiments, the content prediction model generator 164B is operative to determine a content prediction model per location. The location can be the location of the sender or the location of the receiver. In some embodiments, for example depending on the scale of the location, the location of the sender and the location of the receiver can be different locations. In other embodiments, the location of the sender and the location of the receiver are the same. The content prediction model generated for the location can then be used for spam detection in all messages received from senders at that location. A location can be a city, a neighborhood, a street, or any other scale for performing spam detection of senders at that location based on a same model. In some embodiments, the content prediction model generator 164B generates multiple content prediction models, each model for a different location. In other embodiments, the content prediction model generator 164B can generate a content prediction model for a given user. In these embodiments, the data set used for the generation of the content prediction model is built with data related to usage of a single receiver of messages.
[0072] Once the user prediction models and the content prediction models are determined for the different locations, during the training phase, an aggregate prediction model is determined at the aggregate prediction model determiner 166. Thus, during the previously described training phase, two sets of weights are determined for each one of the features, the user-based weights and the content-based weights. The aggregate weights determiner 167 determines, based on the user-based weights and the content-based weights, a set of aggregate weights for the features. The aggregate weights are used for the definition of an aggregate prediction model. The aggregate prediction model, e.g., spam prediction model 152, is used to determine, for a message, whether the sender of the message is a spam user and whether the message includes content spam. In other words, the output of the spam prediction model can be a first indication that the message includes spam content, a second indication that the sender of the message is a spam user, or both.
[0073] The use of the aggregate prediction model, i.e., spam prediction model 152, significantly reduces the time taken for classifying incoming messages and thus makes it feasible to add to a real time solution for spam detection in a message management center 110.
[0074] Several mechanisms can be used to determine the aggregate weights from the user-based weights and the content-based weights. For example, the aggregate weights can be determined based on the following equation:
[0075] $\text{AggregateWeight} = \alpha\,(\text{userWeight}) + \beta\,(\text{contentWeight})$ (1)
[0076] where $\alpha$ and $\beta$ are learnt by the system and describe the contributing factor of each type of prediction to the aggregate spam output of a message.
[0077] The parameters $\alpha$ and $\beta$ can be determined based on machine learning mechanisms such as least squares, random forest, neural networks, etc. The determination is done based on the spam feedback received from the receiver that indicates whether a message was from a spam user or its content was spam. In some embodiments, the operations described with reference to Figure 4 are continuously performed such that the aggregate weights and consequently the aggregate prediction model (i.e., the spam prediction model) are continuously updated based on feedback received from the receivers of the messages.
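One possible reading of equation (1) and of the least-squares option is sketched below, with the two learned factors written as `alpha` and `beta`. The function names are hypothetical, and fitting the factors against per-message user scores, content scores, and feedback labels without an intercept is an assumption; the description names least squares only as one of several candidate mechanisms.

```python
from typing import List, Sequence, Tuple


def aggregate_weights(
    user_w: Sequence[float], content_w: Sequence[float], alpha: float, beta: float
) -> List[float]:
    """Equation (1), applied feature by feature."""
    return [alpha * u + beta * c for u, c in zip(user_w, content_w)]


def fit_alpha_beta(
    user_scores: Sequence[float],
    content_scores: Sequence[float],
    labels: Sequence[float],
) -> Tuple[float, float]:
    """Ordinary least squares for labels ~ alpha * user + beta * content.

    Solves the 2x2 normal equations directly (no intercept term).
    """
    suu = sum(u * u for u in user_scores)
    scc = sum(c * c for c in content_scores)
    suc = sum(u * c for u, c in zip(user_scores, content_scores))
    suy = sum(u * y for u, y in zip(user_scores, labels))
    scy = sum(c * y for c, y in zip(content_scores, labels))
    det = suu * scc - suc * suc
    if abs(det) < 1e-12:
        raise ValueError("scores are collinear; alpha and beta are not identifiable")
    alpha = (suy * scc - scy * suc) / det
    beta = (scy * suu - suy * suc) / det
    return alpha, beta
```

In a continuously updated system, `fit_alpha_beta` would be rerun as new spam feedback arrives, and the aggregate weights recomputed from the result.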
[0078] The proposed method uses a combination of both the user-based features (sender/receiver identifications, respective location, etc.) and the contextual features (industrial/educational, sender location type - derived from additional data sources, sender subscription type, sender VAS subscription parameters, message content, message length/size, number of conversation exchanges, and time of the message, etc.) to build a spam prediction model and perform spam detection of messages in a messaging service of a communication network.
[0079] The operations in the flow diagrams will be described with reference to the exemplary embodiments of Figures 1-4. However, it should be understood that the operations of the flow diagrams can be performed by embodiments of the invention other than those discussed with reference to Figures 1-4, and the embodiments of the invention discussed with reference to Figures 1-4 can perform operations different than those discussed with reference to the flow diagrams.
[0080] Figure 5 illustrates a flow diagram of exemplary operations for message spam detection in accordance with some embodiments.
[0081] The message management center 110 receives, at operation 502, from a first device of a first sender, a message addressed to one or more receivers. The message management center 110 determines, at operation 504, a set of features for the message. The set of features includes at least one or more content features that relate to content of the message, and one or more user features that relate to the first sender. The message management center 110 determines, at operation 506, based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content. Responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, the message management center 110 transmits, at operation 508, a notification, e.g., notifications 103A-K, to one or more wireless devices of the one or more receivers including the at least one of a user flag and a message flag. The user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content. The notifications 103A-K include one or more flags that indicate whether a sender of a message 105A-L is a spam user or alternatively that indicate whether the message includes spam content. In some embodiments, the notifications, e.g., notifications 103A-K, may include (operation 510) the message itself in addition to the spam flags. In other embodiments (operation 511), the notifications 103A-K may include only the flag(s), or the flag(s) and additional information related to the message or the sender that caused the generation of the spam flag. For example, the additional information can include an identifier of the sender.
[0082] Alternatively, responsive to determining that the first sender is not a spam sender and that the message does not include spam content, the message management center 110 transmits, at operation 512, the message to the one or more devices of the one or more receivers.
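The branch between operations 508 and 512 can be illustrated with a short sketch. It assumes that the spam prediction model is available as a callable returning two booleans; `handle_message`, `Notification`, `Delivery`, and the threshold-based `toy_predict` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union

Features = Dict[str, float]


@dataclass(frozen=True)
class Notification:
    message_id: str
    user_flag: bool        # the sender is a spam user
    message_flag: bool     # the message includes spam content
    includes_message: bool  # operation 510 (True) or 511 (False)


@dataclass(frozen=True)
class Delivery:
    message_id: str


def handle_message(
    message_id: str,
    features: Features,
    predict: Callable[[Features], Tuple[bool, bool]],
    include_message: bool = False,
) -> Union[Notification, Delivery]:
    # Operation 506: one model call yields both determinations.
    sender_is_spam, content_is_spam = predict(features)
    if sender_is_spam or content_is_spam:
        # Operation 508: notify the receivers with the relevant flag(s).
        return Notification(message_id, sender_is_spam, content_is_spam, include_message)
    # Operation 512: neither determination is positive, so deliver the message.
    return Delivery(message_id)


def toy_predict(features: Features) -> Tuple[bool, bool]:
    """Stand-in for the spam prediction model; the 0.5 thresholds are arbitrary."""
    return (
        features.get("sender_score", 0.0) > 0.5,
        features.get("content_score", 0.0) > 0.5,
    )
```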
[0083] Figure 6 illustrates a flow diagram of exemplary notifications that can be sent to a user following detection of message spam in accordance with some embodiments. The notifications are determined for a given message based on a spam prediction performed for the message based on content-based features and user-based features.
[0084] In some embodiments, the notifications include, at operation 602, the user flag only when the user flag indicates that the first sender is a spam sender and the message flag does not indicate that the message includes spam content.
[0085] In other embodiments, the notifications include, at operation 604, the message flag only when the message flag indicates that the message includes spam content and the user flag does not indicate that the first sender is a spam user.
[0086] In other embodiments, the notifications include, at operation 606, the message flag and the user flag when the message flag indicates that the message includes spam content and the user flag indicates that the first sender is a spam user.
[0087] In some embodiments, the notifications may further include a single spam flag. In these embodiments, the spam flag may result from the determination that the message includes spam content or that the sender of a message is a spam user. The spam flag may not indicate which of spam content or a spam user caused the generation of the spam flag. However, the receiver of the notification, upon receipt of the spam flag, may request more information. In this embodiment, the WD receiving the notification, e.g., WD 102B, may transmit a request to the message management center 110 for additional information regarding the message that resulted in the transmission of the notification. For example, the request may include the message identifier. Based on the message identifier, the message management center 110 is operative to retrieve the more detailed spam flags, a first flag indicating that the sender is a spam user and/or a second flag indicating that the message includes spam content. The message management center 110 may then transmit to the WD a second notification including the more detailed spam flags.
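The flag combinations of operations 602 to 606, and the two-step exchange in which a single spam flag is later expanded into detailed flags, can be sketched as below. The `FlagStore` class, its in-memory dictionary, and the string flag names are assumptions for illustration; the description does not say how the detailed flags are stored.

```python
from typing import Dict, List


def flags_for_notification(sender_is_spam: bool, content_is_spam: bool) -> List[str]:
    """Operations 602/604/606: include only the flags that apply."""
    flags: List[str] = []
    if sender_is_spam:
        flags.append("user_flag")
    if content_is_spam:
        flags.append("message_flag")
    return flags


class FlagStore:
    """Keeps detailed flags per message identifier for later detail requests."""

    def __init__(self) -> None:
        self._detailed: Dict[str, List[str]] = {}

    def record(self, message_id: str, sender_is_spam: bool, content_is_spam: bool) -> bool:
        """Store the detailed flags and return the single spam flag."""
        self._detailed[message_id] = flags_for_notification(sender_is_spam, content_is_spam)
        return bool(self._detailed[message_id])

    def details(self, message_id: str) -> List[str]:
        """Answer a detail request carrying the message identifier."""
        return list(self._detailed.get(message_id, []))
```

A first notification would carry only the boolean returned by `record`; a second notification would carry the list returned by `details`.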
[0088] Figure 7 illustrates a flow diagram of exemplary operations for receiving spam feedback from a device, e.g., WD 102A, in accordance with some embodiments. Upon receipt of notifications from the message management center 110, the WD is operative to transmit feedback to the message management center 110. For example, a messaging application included in the WD includes a user interface with graphical elements enabling a user of the WD to select one or more graphical elements for transmitting feedback related to the spam notifications received. As a non-limiting example, a graphical element can indicate that the user agrees with the spam flag received. In another non-limiting example, another graphical element can indicate that the user does not agree with the spam flags received. Different types of graphical elements and graphical user interfaces can be used for enabling the user of the WD to transmit feedback to the message management center 110. The messaging application may further include enhanced functionality in the messaging transmission protocol for transmitting the spam feedback to the message management center 110. For example, the WD 102B may report spam feedback using an enhanced SMS-COMMAND or a new SMS-REPORT.
[0089] In some embodiments, when the message management center 110 has transmitted a spam flag to a set of WDs, the operations of Figure 7 may be performed. The message management center 110, e.g., message gateway 120, receives, at operation 702, from a second device of the wireless devices, e.g., WD 102B, spam feedback that indicates whether a second user of the second wireless device agrees with the user flag and the message flag. As discussed above with reference to Figures 1-4, the spam feedback can include spam variables. In some embodiments, the spam feedback includes, at operation 712, a first indication that the second user agrees that the sender is a spam user. In another embodiment, the spam feedback includes, at operation 714, a second indication that the second user agrees that the message includes spam content. In other embodiments, the spam feedback includes, at operation 716, a first indication that the second user agrees that the sender is a spam user and a second indication that the second user agrees that the message includes spam content.
[0090] At operation 704, the message management center 110 uses the spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content. The spam feedback can be used to generate new data sets and the new data sets are used during a new training phase. The result of the new training phase can be an updated prediction model that is adapted to feedback received from the user. This may result in messages determined as spam messages according to a first prediction model no longer being determined as spam according to the updated model, or vice versa. Further, this may result in senders determined as spam users according to the first prediction model no longer being determined to be spam users according to the updated prediction model, or alternatively in senders that were not determined to be spam users according to the first prediction model being determined to be spam users according to the updated prediction model. Thus, the embodiments presented herein allow for more flexibility in spam detection, such that a user may adapt their behavior depending on messages that they want to receive. The system automatically determines new prediction models based on the feedback received.
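One way to turn agree/disagree feedback into labels for a new training data set is sketched below. The `SpamFeedback` record and the `labels_from_feedback` helper are hypothetical, and treating a disagreement as the opposite of the transmitted flag is an assumption about how the feedback would be interpreted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SpamFeedback:
    message_id: str
    agrees_user_flag: Optional[bool]     # None: no opinion given on the sender
    agrees_message_flag: Optional[bool]  # None: no opinion given on the content


def labels_from_feedback(
    user_flag: bool, message_flag: bool, feedback: SpamFeedback
) -> Tuple[Optional[bool], Optional[bool]]:
    """Return (sender_is_spam, content_is_spam) labels, or None where unlabeled."""

    def resolve(flag: bool, agrees: Optional[bool]) -> Optional[bool]:
        if agrees is None:
            return None  # no feedback, so no training label
        # Agreement confirms the flag; disagreement inverts it.
        return flag if agrees else not flag

    return (
        resolve(user_flag, feedback.agrees_user_flag),
        resolve(message_flag, feedback.agrees_message_flag),
    )
```

Rows labeled this way would be appended to the user and content data sets before the next training phase.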
[0091] In some embodiments, when the receiver of the message transmits spam feedback that indicates that the sender is not a spam user, the message management center 110 may transmit the messages that caused the transmission of the user spam notification to the receiver. In these embodiments, upon receipt of the spam feedback, the message management center 110 may determine, based on a message identifier included in the spam feedback, the message that caused the transmission of the spam flag to the WD, retrieve the message from the message repository 130, and transmit the message to the WD. In some embodiments, when transmitting the spam feedback that the sender is not a spam user, the receiver may further include a request to retrieve the message that caused the transmission of the user spam notification to the receiver. The request can be a PULL/RETRIEVE command identifying the message to be retrieved or identifying the sender of the message, or both.
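The release of a withheld message after "not a spam user" feedback can be illustrated as follows. The dictionary standing in for the message repository 130 and the function name `release_on_feedback` are assumptions made for this sketch.

```python
from typing import Dict, Optional


def release_on_feedback(
    repository: Dict[str, str], message_id: str, sender_is_spam: bool
) -> Optional[str]:
    """Return the stored message to transmit, or None if nothing is released."""
    if sender_is_spam:
        # The receiver agrees with the user flag; the message stays withheld.
        return None
    # Look up the message by the identifier carried in the spam feedback.
    return repository.get(message_id)
```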
[0092] While Figure 7 illustrates exemplary operations performed by the message management center 110 upon receipt of spam feedback from a WD that has received one or multiple spam flags from the message management center 110, Figure 8 illustrates a flow diagram of exemplary operations for receiving spam feedback from another device, when this other device has not received any spam flag for a given message. The operations of Figure 8 may be performed following the transmission of a message to a device. Upon receipt of the message from the message management center 110, the WD transmits feedback to the message management center 110. For example, a messaging application included in the WD includes a user interface with graphical elements enabling a user of the WD to select one or more graphical elements for transmitting feedback related to the message received. As a non-limiting example, a graphical element can indicate that the message is spam. In another non-limiting example, another graphical element can indicate that the sender of the message is a spam user. In another non-limiting example, another graphical element can indicate that the content of the message is spam. Different types of graphical elements and graphical user interfaces can be used for enabling the user of the WD to transmit feedback to the message management center 110. The messaging application may further include enhanced functionality in the messaging transmission protocol for transmitting the spam feedback to the message management center 110. For example, the WD 102B may report spam feedback using an enhanced SMS-COMMAND or a new SMS-REPORT.
[0093] At operation 802, the message management center 110 receives, from a third device, second spam feedback that indicates at least one of whether a second sender of a second message received by the third device is a spam user and whether the second message includes spam content. In some embodiments, the spam feedback includes, at operation 812, a third indication that the second sender is a spam user. In another embodiment, the spam feedback includes, at operation 814, a fourth indication that the second message includes spam content. In other embodiments, the spam feedback includes, at operation 816, the third indication that the second sender is a spam user and the fourth indication that the second message includes spam content.
[0094] At operation 804, the message management center 110 uses the spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content. The spam feedback can be used to generate new data sets and the new data sets are used during a new training phase. The result of the new training phase can be an updated prediction model that is adapted to feedback received from the user. This may result in messages determined as spam messages according to a first prediction model no longer being determined as spam according to the updated model, or vice versa. Further, this may result in senders determined as spam users according to the first prediction model no longer being determined to be spam users according to the updated prediction model, or alternatively in senders that were not determined to be spam users according to the first prediction model being determined to be spam users according to the updated prediction model. Thus, the embodiments presented herein allow for more flexibility in spam detection, such that a user may adapt their behavior depending on messages that they want to receive. The system automatically determines new prediction models based on the feedback received.
[0095] In some embodiments, when the receiver of the message transmits spam feedback that indicates that the sender is not a spam user, the message management center 110 may transmit the messages that caused the transmission of the user spam notification to the receiver. In these embodiments, upon receipt of the spam feedback, the message management center 110 may determine based on a message identifier included in the spam feedback, the message that caused the transmission of the spam flag to the WD, retrieve the message from the message repository 130 and transmit the message to the WD.
[0096] Architecture:
[0097] An electronic device stores and transmits (internally and/or with other electronic devices over a network) code (which is composed of software instructions and which is sometimes referred to as computer program code or a computer program) and/or data using machine-readable media (also called computer-readable media), such as machine-readable storage media (e.g., magnetic disks, optical disks, solid state drives, read only memory (ROM), flash memory devices, phase change memory) and machine-readable transmission media (also called a carrier) (e.g., electrical, optical, radio, acoustical or other form of propagated signals - such as carrier waves, infrared signals). Thus, an electronic device (e.g., a computer) includes hardware and software, such as a set of one or more processors (e.g., wherein a processor is a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application specific integrated circuit, field programmable gate array, other electronic circuitry, a combination of one or more of the preceding) coupled to one or more machine-readable storage media to store code for execution on the set of processors and/or to store data. For instance, an electronic device may include non-volatile memory containing the code since the non-volatile memory can persist code/data even when the electronic device is turned off (when power is removed), and while the electronic device is turned on that part of the code that is to be executed by the processor(s) of that electronic device is typically copied from the slower non-volatile memory into volatile memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM)) of that electronic device. Typical electronic devices also include a set of one or more physical network interface(s) (NI(s)) to establish network connections (to transmit and/or receive code and/or data using propagating signals) with other electronic devices.
For example, the set of physical NIs (or the set of physical NI(s) in combination with the set of processors executing code) may perform any formatting, coding, or translating to allow the electronic device to send and receive data whether over a wired and/or a wireless connection. In some embodiments, a physical NI may comprise radio circuitry capable of receiving data from other electronic devices over a wireless connection and/or sending data out to other devices via a wireless connection. This radio circuitry may include transmitter(s), receiver(s), and/or transceiver(s) suitable for radiofrequency communication. The radio circuitry may convert digital data into a radio signal having the appropriate parameters (e.g., frequency, timing, channel, bandwidth, etc.). The radio signal may then be transmitted via antennas to the appropriate recipient(s). In some embodiments, the set of physical NI(s) may comprise network interface controller(s) (NICs), also known as network interface cards, network adapters, or local area network (LAN) adapters. The NIC(s) may facilitate connecting the electronic device to other electronic devices, allowing them to communicate via wire through plugging in a cable to a physical port connected to a NIC. One or more parts of an embodiment of the inventive concept may be implemented using different combinations of software, firmware, and/or hardware.
[0098] As used herein, wireless device (WD) refers to a device capable, configured, arranged and/or operable to communicate wirelessly with network nodes and/or other wireless devices. Communicating wirelessly may involve transmitting and/or receiving wireless signals using electromagnetic waves, radio waves, infrared waves, and/or other types of signals suitable for conveying information through air. In some embodiments, a WD may be configured to transmit and/or receive information without direct human interaction. For instance, a WD may be designed to transmit information to a network on a predetermined schedule, when triggered by an internal or external event, or in response to requests from the network. Examples of a WD include, but are not limited to, a smart phone, a mobile phone, a cell phone, a voice over IP (VoIP) phone, a wireless local loop phone, a desktop computer, a personal digital assistant (PDA), a wireless camera, a gaming console or device, a music storage device, a playback appliance, a wearable terminal device, a wireless endpoint, a mobile station, a tablet, a laptop, a laptop-embedded equipment (LEE), a laptop-mounted equipment (LME), a smart device, a wireless customer-premise equipment (CPE), a vehicle-mounted wireless terminal device, etc. A WD may support device-to-device (D2D) communication, for example by implementing a 3GPP standard for sidelink communication, vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), vehicle-to-everything (V2X) and may in this case be referred to as a D2D communication device. As yet another specific example, in an Internet of Things (IoT) scenario, a WD may represent a machine or other device that performs monitoring and/or measurements, and transmits the results of such monitoring and/or measurements to another WD and/or a network node. The WD may in this case be a machine-to-machine (M2M) device, which may in a 3GPP context be referred to as an MTC device.
As one particular example, the WD may be a UE implementing the 3GPP narrow band internet of things (NB-IoT) standard. Particular examples of such machines or devices are sensors, metering devices such as power meters, industrial machinery, home or personal appliances (e.g., refrigerators, televisions, etc.), or personal wearables (e.g., watches, fitness trackers, etc.). In other scenarios, a WD may represent a vehicle or other equipment that is capable of monitoring and/or reporting on its operational status or other functions associated with its operation. A WD as described above may represent the endpoint of a wireless connection, in which case the device may be referred to as a wireless terminal. Furthermore, a WD as described above may be mobile, in which case it may also be referred to as a mobile device or a mobile terminal. For example, each one of the WD 102A-N can be a wireless device. In other embodiments, each one of the WD 102A-N can be a network device that is communicatively coupled with the message management center via a wired connection as opposed to a wireless connection. One of ordinary skill in the art may understand that the WD 102A-N may be coupled with the message management center 110 through one or more additional network devices that are not illustrated.
[0099] A network device (ND) is an electronic device that communicatively interconnects other electronic devices on the network (e.g., other network devices, end-user devices). Some network devices are "multiple services network devices" that provide support for multiple networking functions (e.g., routing, bridging, switching, Layer 2 aggregation, session border control, Quality of Service, and/or subscriber management), and/or provide support for multiple application services (e.g., data, voice, and video, etc.). In the embodiments described above, each one of the message gateway 120, the message repository 130, the message feature extractor 140, the message spam manager 150, the prediction model generator 160, and the spam feedback repository 170 can be implemented on one or more network devices. In some embodiments, at least two of the message gateway 120, the message repository 130, the message feature extractor 140, the message spam manager 150, the prediction model generator 160, and the spam feedback repository 170 can be implemented on the same network device. Alternatively, each one of the message gateway 120, the message repository 130, the message feature extractor 140, the message spam manager 150, the prediction model generator 160, and the spam feedback repository 170 is implemented on a separate one of multiple network devices.
[00100] Figure 9 illustrates a block diagram for a network device that can be used for implementing one or more components of the message management center described herein, in accordance with some embodiments. The network device 930 may be a web or cloud server, or a cluster of servers, running on server hardware. According to one embodiment, the network device is a server device which includes hardware 905. Hardware 905 includes one or more processors 914 and network communication interfaces 960 coupled with a computer readable storage medium 912. The computer readable storage medium 912 may include the message feature extractor code 932, the message spam manager code 933, the prediction model generator code 936, and the spam feedback repository 934.
[00101] Alternative embodiments of network device 930 may include additional components beyond those shown in Figure 9 that may be responsible for providing certain aspects of the message management center functionality, including any of the functionality described herein and/or any functionality necessary to support the subject matter described herein. For example, network device 930 may include user interface equipment to allow input of information into network device 930 and to allow output of information from network device 930. This may allow a user to perform diagnostic, maintenance, repair, and other administrative functions for network device 930.
[00102] While one embodiment does not implement virtualization, alternative embodiments may use different forms of virtualization - represented by a virtualization layer 920. In these embodiments, the instance 940 and the hardware that executes it form a virtual server which is a software instance of the modules stored on the computer readable storage medium 912.
[00103] The message feature extractor code 932, the message spam manager code 933, and the prediction model generator code 936 include instructions which, when executed by the hardware 905, cause the instance 940 to implement a message feature extractor 952, a message spam manager 953, and a prediction model generator 956 that are operative to perform the operations performed by the components of the message management center 110 described with reference to Figures 1-8.
[00104] Figure 10 illustrates a block diagram for a device that can be used for implementing one or more devices described herein, in accordance with some embodiments. The device 1030 is an electronic device that is adapted to communicate with a message management center and transmit and receive messages through a messaging service. According to one embodiment, the device 1030 includes hardware 1005. Hardware 1005 includes one or more processors 1014 and network communication interfaces 1060 coupled with a computer readable storage medium 1012. The computer readable storage medium 1012 may include a messaging application code 1035 that includes spam feedback user interface code 1032 and spam feedback transmission protocol code 1033.
[00105] Alternative embodiments of device 1030 may include additional components beyond those shown in Figure 10 that may be responsible for providing certain aspects of the message management center functionality, including any of the functionality described herein and/or any functionality necessary to support the subject matter described herein. For example, device 1030 may include user interface equipment to allow input of information into device 1030 and to allow output of information from device 1030.
[00106] While one embodiment does not implement virtualization, alternative embodiments may use different forms of virtualization - represented by a virtualization layer 1020. In these embodiments, the instance 1040 and the hardware that executes it form a virtual server which is a software instance of the modules stored on the computer readable storage medium 1012.
[00107] The messaging application code 1035, which includes spam feedback user interface code 1032 and spam feedback transmission protocol code 1033, includes instructions which, when executed by the hardware 1005, cause the instance 1040 to implement the messaging application 1055, which includes spam feedback user interface 1052 and spam feedback transmission protocol 1053 and is operative to perform the operations performed by the wireless devices described with reference to Figures 1-8.
[00108] While the flow diagrams in the figures show a particular order of operations performed by certain embodiments of the inventive concept, it should be understood that such order is exemplary (e.g., alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, etc.).
[00109] While the inventive concept has been described in terms of several embodiments, those skilled in the art will recognize that the inventive concept is not limited to the embodiments described, and can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting.

Claims

1. A method performed by a network node for spam detection in a communication network, the method comprising:
receiving (502), from a first device of a first sender, a message addressed to one or more receivers;
determining (504) a set of features for the message, wherein the set of features includes at least one or more content features that relate to content of the message, and one or more user features that relate to the first sender;
determining (506), based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content;
responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, transmitting (508) a notification to one or more devices of the one or more receivers including the at least one of a user flag and a message flag, wherein the user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content; and
responsive to determining that the first sender is not a spam sender and the message does not include spam content, transmitting (512) the message to the one or more devices of the one or more receivers.
2. The method of claim 1, wherein the notification includes (510) the message.
3. The method of any of claims 1-2, wherein the at least one of a user flag and a message flag includes (602) the user flag only when the user flag indicates that the first sender is a spam sender and the message flag does not indicate that the message includes spam content.
4. The method of any of claims 1-2, wherein the at least one of a user flag and a message flag includes (604) the message flag only when the message flag indicates that the message includes spam content and the user flag does not indicate that the first sender is a spam user.
5. The method of any of claims 1-2, wherein the at least one of a user flag and a message flag includes (606) the message flag and the user flag when the message flag indicates that the message includes spam content and the user flag indicates that the first sender is a spam user.
6. The method of any of claims 1-5, further comprising:
receiving (702), from a second device of the one or more devices, spam feedback that indicates whether a second user of the second device agrees with the at least one of the user flag and the message flag; and
using (704) the spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content.
7. The method of claim 6, wherein the spam feedback includes at least one of a first indication that the second user agrees that the first sender is a spam user and a second indication that the second user agrees that the message includes spam content.
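By way of illustration only, one possible way to use the spam feedback of claims 6 and 7 as a training feature is sketched below. The claims do not specify how the feedback is encoded; this sketch assumes that feedback is aggregated per sender into agreement rates, which are then appended to each training row. The names `FeedbackStore`, `training_row`, and the feature keys are hypothetical.

```python
from collections import Counter
from typing import Dict


class FeedbackStore:
    """Aggregates receiver feedback per sender (assumed encoding, not from the claims)."""

    def __init__(self) -> None:
        self._yes: Counter = Counter()
        self._seen: Counter = Counter()

    def record(self, sender: str, kind: str, agrees: bool) -> None:
        """Step 702: store one feedback item; kind is "user" or "message"."""
        self._seen[(sender, kind)] += 1
        if agrees:
            self._yes[(sender, kind)] += 1

    def rate(self, sender: str, kind: str) -> float:
        """Fraction of feedback items that agreed with the flag (0.0 if none)."""
        seen = self._seen[(sender, kind)]
        return self._yes[(sender, kind)] / seen if seen else 0.0


def training_row(features: Dict[str, float], sender: str,
                 store: FeedbackStore) -> Dict[str, float]:
    """Step 704: extend a feature row with feedback-derived features."""
    row = dict(features)
    row["fb_user_agree_rate"] = store.rate(sender, "user")
    row["fb_message_agree_rate"] = store.rate(sender, "message")
    return row
```

Rows built this way could be passed to whatever learner produces the updated spam prediction model; the choice of learner is outside this sketch.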
8. The method of any of claims 1-7, further comprising:
receiving (802), from a third device, second spam feedback that indicates at least one of whether a second sender of a second message is a spam user and whether the second message received at the third device includes spam content; and
using (804) the second spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content.
9. The method of claim 8, wherein the second spam feedback includes at least one of a third indication that the second sender is a spam user and a fourth indication that the second message includes spam content.
10. The method of any of claims 1-8, wherein the message includes one or more of text, image, video, and a hyperlink.
11. The method of any of claims 1-10, wherein the one or more devices of the one or more receivers are located at a same location and the spam prediction model is determined for the location.
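By way of illustration only, a spam prediction model determined per location, as in claim 11, could be selected through a simple lookup. The class `LocationModelRegistry`, the location keys, and the fallback to a default model for unknown locations are assumptions of this sketch and are not recited in the claims.

```python
from typing import Callable, Dict, Tuple

Features = Dict[str, float]
# Assumed model interface: features -> (sender_is_spam, content_is_spam).
SpamModel = Callable[[Features], Tuple[bool, bool]]


class LocationModelRegistry:
    """Maps a receiver location to the model determined for that location."""

    def __init__(self, default_model: SpamModel) -> None:
        self._default = default_model              # fallback is an assumption
        self._by_location: Dict[str, SpamModel] = {}

    def register(self, location: str, model: SpamModel) -> None:
        self._by_location[location] = model

    def model_for(self, location: str) -> SpamModel:
        """Return the location-specific model, or the default if none exists."""
        return self._by_location.get(location, self._default)
```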
12. A machine-readable medium comprising computer program code which when executed by a computer carries out the method steps of any of claims 1-11.
13. A network node for spam detection in a communication network, the network node including:
one or more processors; and
non-transitory computer readable storage media that stores instructions, which when executed by the one or more processors cause the network node to:
receive (502), from a first device of a first sender, a message addressed to one or more receivers,
determine (504) a set of features for the message, wherein the set of features includes at least one or more content features that relate to content of the message, and one or more user features that relate to the first sender,
determine (506), based on a spam prediction model and the set of features for the message, whether the first sender is a spam sender and whether the message includes spam content,
responsive to determining that at least one of the first sender is a spam sender and the message includes spam content, transmit (508) a notification to one or more wireless devices of the one or more receivers including the at least one of a user flag and a message flag, wherein the user flag indicates that the first sender is a spam user and the message flag indicates that the message includes spam content, and
responsive to determining that the first sender is not a spam sender and the message does not include spam content, transmit (512) the message to the one or more devices of the one or more receivers.
14. The network node of claim 13, wherein the notification includes (510) the message.
15. The network node of any of claims 13-14, wherein the at least one of a user flag and a message flag includes (602) the user flag only when the user flag indicates that the first sender is a spam sender and the message flag does not indicate that the message includes spam content.
16. The network node of any of claims 13-14, wherein the at least one of a user flag and a message flag includes (604) the message flag only when the message flag indicates that the message includes spam content and the user flag does not indicate that the first sender is a spam user.
17. The network node of any of claims 13-14, wherein the at least one of a user flag and a message flag includes (606) the message flag and the user flag when the message flag indicates that the message includes spam content and the user flag indicates that the first sender is a spam user.
18. The network node of any of claims 13-17, wherein the instructions when executed by the one or more processors are further to cause the network node to:
receive (702), from a second wireless device of the one or more devices, spam feedback that indicates whether a second user of the second device agrees with the at least one of the user flag and the message flag; and
use (704) the spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content.
19. The network node of claim 18, wherein the spam feedback includes at least one of a first indication that the second user agrees that the first sender is a spam user and a second indication that the second user agrees that the message includes spam content.
20. The network node of any of claims 13-19, wherein the instructions when executed by the one or more processors are further to cause the network node to:
receive (802), from a third device, second spam feedback that indicates at least one of whether a second sender of a second message is a spam user and whether the second message received at the third device includes spam content; and
use (804) the second spam feedback as a feature for training the spam prediction model to obtain an updated spam prediction model that is to be used for determining, for new messages received, whether senders and message contents are respectively spam users or include spam content.
21. The network node of claim 20, wherein the second spam feedback includes at least one of a third indication that the second sender is a spam user and a fourth indication that the second message includes spam content.
22. The network node of any of claims 13-21, wherein the message includes one or more of text, image, video, and a hyperlink.
23. The network node of any of claims 13-22, wherein the one or more devices of the one or more receivers are located at a same location and the spam prediction model is determined for the location.
PCT/IN2019/050253 2019-03-28 2019-03-28 Method and system for message spam detection in communication networks WO2020194323A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IN2019/050253 WO2020194323A1 (en) 2019-03-28 2019-03-28 Method and system for message spam detection in communication networks

Publications (1)

Publication Number Publication Date
WO2020194323A1 (en) 2020-10-01

Family

ID=72611688

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2019/050253 WO2020194323A1 (en) 2019-03-28 2019-03-28 Method and system for message spam detection in communication networks

Country Status (1)

Country Link
WO (1) WO2020194323A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080140781A1 (en) * 2006-12-06 2008-06-12 Microsoft Corporation Spam filtration utilizing sender activity data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KARAMI, AMIR ET AL.: "IMPROVING STATIC SMS SPAM DETECTION BY USING NEW CONTENT-BASED FEATURES", 20TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS, AMCIS 2014, 31 August 2014 (2014-08-31), pages 1 - 9, XP055744988 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19922094

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19922094

Country of ref document: EP

Kind code of ref document: A1