Malicious traffic detection using traffic fingerprints and machine learning

Over the past year, we’ve worked on a machine learning project at Ben Gurion University of the Negev.
The project asked whether we can identify malicious traffic (viruses, botnets, command-and-control channels) hidden within ‘normal’ network traffic without advanced heuristics or deep packet inspection, using only the statistical breakdown of the packets together with supervised machine learning algorithms and clustering.

We took the inter-arrival and departure times of the packets seen on a network connection and used the Lempel-Ziv 78 (LZ78) compression algorithm to assign probabilities to them, which led to some interesting results.

This means that even malware that transports its data over a TLS-encrypted flow can be identified without decrypting the data first.
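To give a feel for how timing plus LZ78 can yield probabilities, here is a minimal sketch. It is not the code from the paper or the repository: the bin edges, the add-one smoothing, the class labels, and the names `quantize`, `LZ78Model`, and `classify` are all assumptions made for illustration. The idea is to turn packet gaps into a small alphabet of symbols, build an LZ78 phrase tree per traffic class, and score a new connection by how probable its symbol sequence is under each tree.

```python
import math


def quantize(timestamps, edges=(0.001, 0.01, 0.1, 1.0)):
    """Turn packet timestamps (seconds) into symbols, one per inter-arrival gap.

    The bin edges are an illustrative assumption; a gap maps to the number
    of edges it reaches, so the alphabet size is len(edges) + 1.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return [sum(gap >= edge for edge in edges) for gap in gaps]


class LZ78Model:
    """LZ78 phrase tree with per-node symbol counts (hypothetical helper)."""

    def __init__(self, alphabet_size):
        self.alphabet_size = alphabet_size
        self.children = [{}]  # node id -> {symbol: child node id}
        self.counts = [{}]    # node id -> {symbol: times seen at this node}

    def train(self, symbols):
        node = 0  # start at the root
        for s in symbols:
            self.counts[node][s] = self.counts[node].get(s, 0) + 1
            if s in self.children[node]:
                node = self.children[node][s]  # keep extending the phrase
            else:
                # New phrase: add a leaf, then restart parsing at the root.
                self.children[node][s] = len(self.children)
                self.children.append({})
                self.counts.append({})
                node = 0

    def log_prob(self, symbols):
        """Log2 probability of a sequence, with add-one smoothing per node."""
        node, total = 0, 0.0
        for s in symbols:
            seen = self.counts[node]
            p = (seen.get(s, 0) + 1) / (sum(seen.values()) + self.alphabet_size)
            total += math.log2(p)
            # Follow the tree if possible, otherwise fall back to the root.
            node = self.children[node].get(s, 0)
        return total


def classify(symbols, models):
    """Return the label whose model gives the sequence the highest score."""
    return max(models, key=lambda label: models[label].log_prob(symbols))
```

In use, one model would be trained per class (for example on symbol sequences extracted from labelled pcap files), and an unseen connection would be passed through `quantize` and then `classify`. Because only timestamps are needed, nothing in this sketch has to look inside the payload.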

The article was originally due to be published in November 2014, but we missed several deadlines. Rather than leave it buried in my files, I’ve attached it here for anyone interested.

The research paper

Malicious traffic detection using traffic fingerprint (PDF, 6MB)

Our Python3 source code

https://github.com/arnons1/trafficfingerprint

Posted

19/01/2015

Network Security

Arnon Shimoni

Tags:

algorithm, ben gurion, bgu, botnet, capture, cryptolocker, fingerprint, machine learning, malware, ml, network, python, security, supervised learning, university, virus

Comments

2 responses to “Malicious traffic detection using traffic fingerprints and machine learning”

sam ed

03/03/2015

Really interesting, thank you. What traffic fingerprinting application did you use?

1. arnon.shimoni@gmail.com
  
  04/03/2015
  
  We did our own fingerprinting, with Wireshark pcap files.
  The algorithm is explained in the document.
  
