Spam detection with Machine Learning

Can predictive models be used in classifying emails?

According to a 2019 Radicati report, more than half of the global population actively uses email. Most office workers check their email constantly for fear of missing an important message, yet only 42% of emails are important or relevant. Often, the problem lies in how email is managed: with an average backlog of 200 unprocessed emails, it becomes difficult to identify important messages, or even urgent ones.

On average, an employee checks email 15 times a day, each time trying to pick out from the incoming flow what is important, what is urgent, what is for information only, and what is spam. Machine learning can help sort incoming email into categories and surface the 42% of messages that matter, saving precious time. Avoiding spam alone can save approximately $1,934 per person per year.

Problems to solve

  • Can predictive models be used to classify emails automatically?
  • Is it possible to detect spam?
  • Is it possible to identify important and urgent messages through the use of machine learning?

Benefits of TADA in the Marketing Industry

    Office workers (for their own use), IT managers (to support IT infrastructure users) and Internet Service Providers (ISPs) handle a large volume of emails. However, most of them are not data scientists and don’t have the Machine Learning skills or coding knowledge needed to create predictive models. About 269 billion emails are sent each day worldwide, which can sound like Big Data, but an average email user receives only about 120 emails a day. This is not Big Data but rather Small Data, and Small Data sets are not well handled by traditional Machine Learning tools.

    MyDataModels offers TADA, a solution designed for Small Data that helps office workers and IT managers build predictive models from their Small Data sets. Professionals don’t need training in data science to use TADA. They can use their own Small Data sets without data pre-processing or normalization.

    TADA is a self-service solution that requires no data science knowledge from office workers, IT managers or ISPs. Results are obtained in minutes on a standard laptop.

    TADA brings new possibilities for email classification

    86% of professionals name email as their preferred means of business communication. Email also ranks as the third most reliable source of information for B2B audiences, topped only by colleague recommendations and industry-specific influencers.

    However, the two and a half hours spent each day reading and replying to emails is a lot of time, much of it unproductive. Using machine learning to sort and classify emails more efficiently is a clear productivity gain for professionals. The time previously spent sorting and classifying can be reinvested in more important tasks.

    “Predictive models make it possible to improve spam detection”

    TADA can be used by an email client provider to implement features such as mailbox sorting, improved spam detection, and limiting fraud, phishing or the spread of computer viruses. Office workers, IT managers and ISPs can use TADA predictive models to decide quickly and efficiently which emails must be handled first and which ones should be discarded (spam).
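To make the idea of a spam-detecting predictive model concrete, here is a minimal sketch of a generic Naive Bayes text classifier written in plain Python. It is not TADA's method or API, which this article does not describe. The function names (`tokenize`, `train`, `classify`) and the four training messages are hypothetical and chosen only for illustration; a real model would need far more labeled emails.

```python
import math
import re
from collections import Counter

# Hypothetical toy training set: (email text, label) pairs.
TRAINING = [
    ("Win a free prize now, click here", "spam"),
    ("Cheap loans, limited offer, click now", "spam"),
    ("Meeting moved to 3pm, please confirm", "ham"),
    ("Please review the attached quarterly report", "ham"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(messages):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability score.

    Assumes the training set contains at least one message per label.
    """
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Start from the prior: the share of training messages with this label.
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("Click here for a free offer", model))
print(classify("Please confirm the meeting report", model))
```

The same scoring approach could be extended with more labels (for example "urgent" or "for information") to prioritize messages rather than only filter spam, provided labeled examples exist for each category.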

    MyDataModels brings a self-service solution for those who have Small Data and no data scientists.