Spam Detection by Combining Bayesian Method and Regression Analysis

Srikanth, K. (2024) Spam Detection by Combining Bayesian Method and Regression Analysis. In: Research Updates in Mathematics and Computer Science Vol. 3. B P International, pp. 55-69. ISBN 978-81-971665-3-2

Full text not available from this repository.

Abstract

This study proposes a new method that exploits the correlation between the number of words in a mail and its Bayesian score. Spam mails usually do not have a stable style or stable features, because spammers keep changing them. The most commonly used statistical filter for email is the Naive Bayesian filter, but its design depends on the training data and the word corpus chosen by the filter designer. A new mail of unknown nature is classified as spam (unsolicited mail) or ham (legitimate mail) based on a score that combines the conditional probabilities of the tokens in the mail. The statistical behavior of this score shows some interesting features, which can be exploited to improve the performance of the filter. We report the results of an experiment using the Enron data set and highlight the advantages of the new filter. We also propose a new method of testing the model using random data sets.
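
The full text is not available from this record, so the following is only a minimal sketch of the two ingredients the abstract names, not the author's actual method. It assumes the Bayesian score is a log-odds sum over tokens with add-one smoothing, and that the relation between word count and score is modeled with a simple least-squares line. The function names `train`, `bayes_score`, and `fit_line` are hypothetical.

```python
import math
from collections import Counter


def train(spam_docs, ham_docs):
    """Count token occurrences per class; each document is a list of tokens."""
    spam_counts = Counter(t for d in spam_docs for t in d)
    ham_counts = Counter(t for d in ham_docs for t in d)
    vocab = set(spam_counts) | set(ham_counts)
    return spam_counts, ham_counts, vocab


def bayes_score(tokens, spam_counts, ham_counts, vocab, prior_spam=0.5):
    """Log-odds that a mail is spam (assumed form: add-one smoothing)."""
    n_spam = sum(spam_counts.values())
    n_ham = sum(ham_counts.values())
    v = len(vocab)
    # Start from the prior log-odds, then add one term per token.
    score = math.log(prior_spam / (1.0 - prior_spam))
    for t in tokens:
        p_t_spam = (spam_counts[t] + 1) / (n_spam + v)
        p_t_ham = (ham_counts[t] + 1) / (n_ham + v)
        score += math.log(p_t_spam / p_t_ham)
    return score


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x.

    Requires at least two distinct x values, otherwise the slope is undefined.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b
```

In this sketch a positive score favors spam and a negative score favors ham. One possible use of `fit_line` is to regress the scores of training mails on their word counts and inspect how far a new mail's score departs from the fitted line, though how the chapter actually combines the two quantities is not stated in the abstract.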

Item Type: Book Section
Subjects: Pustakas > Mathematical Science
Depositing User: Unnamed user with email support@pustakas.com
Date Deposited: 08 Apr 2024 08:24
Last Modified: 08 Apr 2024 08:24
URI: http://archive.pcbmb.org/id/eprint/1944
