CHARACTERIZATION OF SPAM TRAFFIC
Spam Museum, Austin - MN

Unsolicited Bulk Email and Unsolicited Commercial Email, known as spam, account for a very large percentage of the mail messages flowing over the Internet every day. Spammers take advantage of the near-zero cost of sending email to flood the network, knowing that even a very small number of successes yields a profit. The worldwide cost of spam was estimated at 39 billion euros in 2005. Solutions exist to reduce the amount of spam seen by end users, but they cannot withstand sophisticated attacks. Moreover, these solutions occasionally misclassify legitimate email and silently drop it.

The goal of this project is to characterize spam traffic, identifying its patterns and the factors that influence its distribution. Our studies rely on measurements collected on the mail servers of the Computing Center of the University of Pavia, which provide email services to more than 5,000 users. The measurements include the standard log files produced by Postfix as well as the logs produced by the Sophos PureMessage anti-spam solution running on the servers. Moreover, the project analyzes message headers and bodies with the objective of identifying the "tricks" used by spammers to fool the filters and studying how these tricks evolve.
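
As an illustration of the kind of measurement that can be extracted from mail server logs, the following is a minimal sketch, not the project's actual tooling. It assumes Postfix smtpd entries in the classic syslog layout ("Mon DD HH:MM:SS host postfix/smtpd[pid]: connect from name[ip]"). The function name connections_per_hour and the sample lines (which use documentation-only IP addresses) are hypothetical and were made up for this example. It counts incoming SMTP connections per hour of day and per client address, two simple indicators of traffic patterns.

```python
import re
from collections import Counter

# Assumed log layout (classic syslog-style Postfix smtpd entry):
#   "Jan 12 03:14:07 mail postfix/smtpd[1234]: connect from host[192.0.2.1]"
CONNECT_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+\S+\s+"
    r"postfix/smtpd\[\d+\]:\s+connect from\s+\S*\[(?P<ip>[^\]]+)\]"
)


def connections_per_hour(lines):
    """Count smtpd 'connect from' events per hour of day and per client IP.

    Lines that do not match the assumed layout are skipped.
    """
    per_hour = Counter()
    per_ip = Counter()
    for line in lines:
        match = CONNECT_RE.match(line)
        if match is None:
            continue
        per_hour[int(match.group("hour"))] += 1
        per_ip[match.group("ip")] += 1
    return per_hour, per_ip


# Hypothetical sample lines, invented for illustration only.
SAMPLE_LOG = [
    "Jan 12 03:14:07 mail postfix/smtpd[1234]: connect from unknown[192.0.2.1]",
    "Jan 12 03:15:10 mail postfix/smtpd[1235]: connect from unknown[192.0.2.1]",
    "Jan 12 03:15:11 mail postfix/smtpd[1235]: disconnect from unknown[192.0.2.1]",
    "Jan 12 14:02:55 mail postfix/smtpd[1300]: connect from mx.example.org[198.51.100.7]",
]

if __name__ == "__main__":
    per_hour, per_ip = connections_per_hour(SAMPLE_LOG)
    for hour in sorted(per_hour):
        print(f"{hour:02d}:00  {per_hour[hour]} connections")
    for ip, count in per_ip.most_common():
        print(f"{ip}  {count} connections")
```

A real analysis would need to handle log rotation, other timestamp formats, and the correlation of these entries with the verdicts recorded by the anti-spam filter; none of that is attempted here.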