WEB ROBOT DETECTION

The traffic produced by the periodic crawling activities of Web robots now accounts for a substantial fraction of overall website traffic and has a non-negligible effect on site performance. To cope with the presence of Web robots, it is therefore important to understand and predict their behavior. The traffic generated on different websites is analyzed to characterize the behavior of popular search engines as well as that of malicious robots.

The Big Data stored in Web logs or captured by sniffing network traffic is analyzed with the objective of identifying models that summarize the behavior of overall Web traffic as well as the traffic of individual robots. These models can be used for forecasting and for Web robot detection.
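
As a rough illustration of the log analysis step, the sketch below groups Web log entries by client and applies a naive detection rule. It is not one of the models referred to above. It assumes logs in the Apache Combined Log Format, and the names LOG_RE, ROBOT_TOKENS, summarize and looks_like_robot, the token list, and the sample entries are all hypothetical and chosen only for the example.

```python
import re
from collections import defaultdict

# Assumed input: Apache Combined Log Format, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; a real list would be much longer.
ROBOT_TOKENS = ("bot", "crawler", "spider")


def summarize(lines):
    """Count requests and robots.txt fetches per (host, user agent)."""
    stats = defaultdict(lambda: {"requests": 0, "robots_txt": 0})
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the assumed format
        key = (m["host"], m["agent"])
        stats[key]["requests"] += 1
        if m["path"] == "/robots.txt":
            stats[key]["robots_txt"] += 1
    return dict(stats)


def looks_like_robot(agent, counts):
    """Naive rule: self-declared robot, or a client that fetched robots.txt."""
    declared = any(token in agent.lower() for token in ROBOT_TOKENS)
    return declared or counts["robots_txt"] > 0


# Made-up sample entries for demonstration.
SAMPLE = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /robots.txt HTTP/1.1" '
    '200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2023:13:55:40 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.7 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for (host, agent), counts in summarize(SAMPLE).items():
        print(host, counts["requests"], looks_like_robot(agent, counts))
```

A rule of this kind only catches robots that identify themselves or respect robots.txt. Malicious robots typically do neither, which is why models built on observed traffic behavior are of interest.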