What is deep packet inspection?
Digging into DPI
Data is the lifeblood of the digital economy, and marketing companies are constantly looking for ways to squeeze just a little bit more out of the average consumer. Traditional tracking methods like marketing cookies give plenty of insight into your browsing habits, but that’s nothing compared to the scrutiny of your internet traffic enabled by Deep Packet Inspection (DPI).
DPI is a sophisticated method of inspecting internet traffic that has many legitimate applications for improving network security and efficiency. However, it can also be abused by marketing agencies to spy on you and weaponized by repressive governments for censorship, surveillance, and blocking access to tools like VPNs. Fortunately, some of the best VPNs offer obfuscation technologies that help you get around DPI-based protocol blocking.
Understanding how DPI works and the risks it poses is essential for anyone concerned about online privacy and freedom. So, read on, and I’ll explore DPI in detail, going over its functions, uses, and implications for VPN users.
Deep Packet Inspection
DPI stands for Deep Packet Inspection, a method of analyzing network traffic at a granular level by examining the content of individual packets.
While more taxing on networking devices than regular packet analysis, DPI allows network admins to categorize traffic according to its contents, in ways that would be impossible using traditional header-only packet analysis.
A data packet is the fundamental unit of data sent over a network. To enable communication between devices, streams of these packets are sent over the network and routed through intermediate nodes until they end up at the intended destination.
Even if they reach the destination out of order, they can be reassembled at the recipient’s end to form a meaningful communication instead of a garbled mess of nonsensical data.
To enable all of the networking magic that makes this possible, the packet is split into two main components:
- The first is the header, which contains metadata about the packet, such as the source and destination IP addresses, protocol type, and packet size.
- Then there’s the payload, which is the actual content of the data being transmitted intended for an application on the recipient’s end, such as a web page request, email, or video stream.
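To make the header and payload split concrete, here is a minimal Python sketch that pulls the metadata fields out of a raw IPv4 packet and returns whatever follows the header as the payload. The function name `split_ipv4_packet` and the sample addresses are my own illustrative choices, and the sketch glosses over the fact that, in real traffic, the IP payload usually carries a TCP or UDP segment with a header of its own.

```python
import socket
import struct
from typing import Dict, Tuple


def split_ipv4_packet(packet: bytes) -> Tuple[Dict[str, object], bytes]:
    """Split a raw IPv4 packet into header metadata and payload bytes."""
    # Low 4 bits of the first byte give the header length in 32-bit words.
    header_len = (packet[0] & 0x0F) * 4
    # The fixed part of the IPv4 header is 20 bytes long.
    (_, _, total_len, _, _, _, protocol, _, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20]
    )
    header = {
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "protocol": protocol,   # e.g. 6 = TCP, 17 = UDP
        "size": total_len,      # header plus payload, in bytes
    }
    return header, packet[header_len:total_len]


# Build a toy packet: a 20-byte header followed by some application data.
data = b"GET / HTTP/1.1\r\n"
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20 + len(data), 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7"),
) + data

meta, body = split_ipv4_packet(sample)
print(meta, body)
```

A header-only tool would stop at `meta`; a DPI tool keeps going and looks at `body` as well.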
Traditional firewalls use stateless packet filtering, which examines only the header of each packet. Decisions to block or allow the traffic are made based on simple matching rules like IP address, port number, or protocol type.
While most network tools analyze only the header, DPI-enabled tools inspect both the header and the payload.
This enables network admins to identify not just where the data is going, but also what kind of data is being transmitted. DPI goes one step beyond stateless packet filtering in terms of the sort of rules that can be applied, allowing a network admin to apply advanced content filtering rules based on the contents of an HTTP request, or even just prioritize certain types of traffic based on the application in use.
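The difference between the two approaches can be sketched in a few lines of Python. Everything here is hypothetical: the `Packet` class, the blocked port, and the blocked hostname are stand-ins I've made up for illustration, and the content rule only works on plaintext HTTP, since an encrypted HTTPS payload wouldn't expose its Host header this way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int
    payload: bytes


BLOCKED_PORTS = {23}               # hypothetical header-only rule (Telnet)
BLOCKED_HOSTS = {"example.test"}   # hypothetical content rule


def stateless_allows(pkt: Packet) -> bool:
    """Header-only decision: never looks at the payload."""
    return pkt.dst_port not in BLOCKED_PORTS


def http_host(payload: bytes) -> Optional[str]:
    """Return the Host header of a plaintext HTTP request, if there is one."""
    head = payload.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:   # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", "replace").lower()
    return None


def dpi_allows(pkt: Packet) -> bool:
    """Header rules first, then a rule based on the payload contents."""
    if not stateless_allows(pkt):
        return False
    return http_host(pkt.payload) not in BLOCKED_HOSTS


request = Packet(
    "192.0.2.1", "198.51.100.7", 80,
    b"GET /page HTTP/1.1\r\nHost: Example.test\r\n\r\n",
)
print(stateless_allows(request), dpi_allows(request))
```

The same packet passes the header-only check but is rejected once the payload is taken into account.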
What is DPI used for?
DPI has a wide range of applications, from cybersecurity to content control. Below are its most common uses:
1. Malware and threat detection
DPI is built into some firewalls and intrusion detection systems to analyze traffic and spot malicious patterns. By identifying signatures of malware, ransomware, or phishing attempts, DPI can block threats before they cause harm.
For example, a DPI-enabled firewall could recognize a string inside a packet that’s associated with botnet command and control. It could then automatically block any further communication with the associated IP address, or allow the traffic to pass while silently raising a flag for incident response.
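As a rough illustration of that block-or-alert behavior, the sketch below matches payloads against a tiny signature table. The signature strings, the `inspect` function, and the in-memory `blocked_ips` set are all invented for this example; real intrusion detection systems use far richer rule languages and matching engines.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    ALERT = "alert"   # let the traffic pass, but flag it for incident response
    BLOCK = "block"


# Hypothetical signatures mapped to the action they trigger.
SIGNATURES = {
    b"HYPOTHETICAL-C2-BEACON": Action.BLOCK,
    b"HYPOTHETICAL-ODD-AGENT": Action.ALERT,
}

blocked_ips = set()


def inspect(dst_ip: str, payload: bytes) -> Action:
    """Decide what to do with one outbound packet."""
    if dst_ip in blocked_ips:
        return Action.BLOCK          # destination was blocked earlier
    for signature, action in SIGNATURES.items():
        if signature in payload:
            if action is Action.BLOCK:
                blocked_ips.add(dst_ip)   # drop all further traffic to it
            return action
    return Action.ALLOW


print(inspect("203.0.113.9", b"..HYPOTHETICAL-C2-BEACON.."))
print(inspect("203.0.113.9", b"harmless follow-up"))
```

Once a destination trips a blocking signature, later packets to it are dropped even if their payloads look harmless.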
2. Preventing corporate data leaks
Organizations use DPI to monitor outgoing traffic for unauthorized data sharing. This ensures that sensitive information, such as intellectual property or customer data, isn’t leaked.
It’s particularly useful for identifying data exfiltration over services that an organization has whitelisted, such as Skype, Discord, or cloud-sharing platforms like OneDrive, which wouldn’t raise an alert on a traditional stateless firewall.
3. Compliance with privacy regulations
As a logical follow-on, businesses in regulated industries can also use DPI to ensure compliance with data protection laws.
For example, DPI can be enabled to flag attempts to transmit sensitive data, helping companies avoid accidental breaches of GDPR or HIPAA rules by employees.
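Here is one way such a flag might work, sketched in Python under the assumption that the sensitive data in question is payment card numbers. The pattern, the function names, and the sample payload are illustrative only; a real data loss prevention product would cover many more data types and would need access to the unencrypted payload.

```python
import re
from typing import List

# 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(rb"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to weed out random digit strings."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:        # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(payload: bytes) -> List[str]:
    """Return card-like numbers found in an outgoing payload."""
    hits = []
    for match in CARD_CANDIDATE.finditer(payload):
        digits = re.sub(rb"\D", b"", match.group()).decode()
        if luhn_valid(digits):
            hits.append(digits)
    return hits


# 4111 1111 1111 1111 is a widely used test number, not a real card.
outgoing = b"name=Alice&card=4111 1111 1111 1111&note=hi"
print(find_card_numbers(outgoing))
```

The checksum step matters because a bare digit-count pattern would flag order numbers and phone numbers far too often.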
4. Parental controls
DPI is also extremely useful for filtering inappropriate or harmful content. Most traditional parental control systems work using a proxy server or a DNS blocklist, where sites that host harmful content are identified and blocked ahead of time rather than dynamically.
DPI, by contrast, can scan HTTP requests for individual keywords that suggest harmful content and block access to sites in real time, making it a valuable tool for schools, parents, and guardians who want to create safer browsing environments.
5. Traffic prioritization
ISPs use DPI to manage network traffic. By identifying protocols that suffer in quality from excessive delay, such as streaming services or VoIP calls, these traffic streams can be prioritized for immediate transport by network devices over less urgent activities like file downloads.
This helps to ensure a smooth experience when you use real-time applications even in congested networks.
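A simple way to picture this is a priority queue sitting in front of a congested link. The sketch below assumes the traffic has already been classified by DPI; the class names and their priority values are hypothetical, and real network equipment uses more elaborate queuing disciplines than this.

```python
import heapq
import itertools

# Hypothetical traffic classes: lower number means sent sooner.
PRIORITY = {"voip": 0, "streaming": 1, "web": 2, "bulk": 3}


class PriorityScheduler:
    """Send latency-sensitive packets before bulk transfers."""

    def __init__(self):
        self._heap = []
        # The counter keeps packets of equal priority in arrival order.
        self._counter = itertools.count()

    def enqueue(self, traffic_class: str, packet: str) -> None:
        # Unknown classes are treated as the lowest priority.
        priority = PRIORITY.get(traffic_class, max(PRIORITY.values()))
        heapq.heappush(self._heap, (priority, next(self._counter), packet))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


scheduler = PriorityScheduler()
scheduler.enqueue("bulk", "file-chunk-1")
scheduler.enqueue("voip", "call-frame-1")
scheduler.enqueue("web", "page-request-1")
scheduler.enqueue("voip", "call-frame-2")

order = [scheduler.dequeue() for _ in range(len(scheduler))]
print(order)
```

Even though the file chunk arrived first, both call frames leave the queue ahead of it.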
6. Blocking unlawful downloads
Conversely, ISPs can use DPI to identify illegal file-sharing over protocols like BitTorrent by matching on strings like movie and game titles.
Many ISPs also use DPI to deprioritize torrenting traffic, effectively throttling your bandwidth to make sure that other users aren’t slowed down.
Can DPI detect VPN usage?
While DPI is useful for protecting enterprise networks, it’s a double-edged sword if you’re using a VPN.
Although expensive to implement, DPI can identify VPN usage in ways that traditional network filtering methods cannot. Blocking VPNs may be understandable in workplaces or schools, but oppressive regimes with high levels of internet surveillance and censorship often use DPI in combination with other techniques to detect and block VPN traffic.
This can ultimately make it impossible to access alternative news sources and information resources like Wikipedia.
The most straightforward method involves identifying the specific ports that VPNs commonly use and blocking them, which can be carried out using a stateless firewall. However, this is easily defeated by specifying a different network port, which most top-tier VPNs allow you to do from inside the client app.
ISPs also try to identify known VPN servers using a combination of open-source intelligence and probe packets, which they then add to a block list. All the traffic you try to send to a VPN server is then dropped, regardless of the protocol used.
Basic DPI uses protocol analysis to augment these techniques. VPN protocols such as OpenVPN and WireGuard have fields in their packet headers that are unique to those protocols, so identifying them via DPI allows an ISP to block them immediately.
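To give a feel for how simple this kind of fingerprinting can be, the sketch below checks the first bytes of a UDP payload against two patterns. The constants reflect my understanding of the publicly documented wire formats (a WireGuard handshake initiation is 148 bytes and starts with a type byte of 1 followed by three zero bytes; an OpenVPN client hard reset carries opcode 7 in the top five bits of its first byte), so treat them as assumptions to verify rather than a reference. The function name is my own.

```python
from typing import Optional

WG_HANDSHAKE_INIT_LEN = 148          # assumed WireGuard initiation size
WG_HANDSHAKE_INIT_PREFIX = b"\x01\x00\x00\x00"
OPENVPN_HARD_RESET_CLIENT_V2 = 7     # assumed OpenVPN opcode


def guess_vpn_protocol(udp_payload: bytes) -> Optional[str]:
    """Guess the VPN protocol from the first packet of a UDP flow."""
    if (
        len(udp_payload) == WG_HANDSHAKE_INIT_LEN
        and udp_payload[:4] == WG_HANDSHAKE_INIT_PREFIX
    ):
        return "wireguard"
    # OpenVPN packs the opcode into the top 5 bits of the first byte.
    if udp_payload and udp_payload[0] >> 3 == OPENVPN_HARD_RESET_CLIENT_V2:
        return "openvpn"
    return None


wireguard_like = WG_HANDSHAKE_INIT_PREFIX + bytes(144)
openvpn_like = b"\x38" + bytes(13)
print(guess_vpn_protocol(wireguard_like), guess_vpn_protocol(openvpn_like))
```

A check this crude would produce false positives on other traffic that happens to start with the same bytes, which is one reason it tends to be combined with the other techniques described here.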
To counter protocol-based DPI analysis, VPN developers have turned to obfuscation techniques.
The most common method is encapsulation, where VPN traffic is hidden within another protocol like HTTPS. This disguises VPN traffic by making it appear as regular encrypted web browsing.
However, DPI doesn’t just stop at header inspection. Advanced DPI systems, such as those employed by China's Great Firewall, can identify characteristics that suggest VPN traffic, such as packet sizes and transmission patterns. These patterns make it possible to identify and block VPN usage even when the traffic is encrypted and encapsulated.
Given that encapsulated traffic can be detected through metrics like timing and packet size, some VPN protocols insert junk packets between legitimate ones or vary the timing of requests and responses, disrupting the patterns characteristic of VPN usage. While these measures improve stealth, they also introduce significant overhead, reducing bandwidth and slowing connection speeds.
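The idea can be sketched as follows: pad every message up to one of a few fixed sizes so that lengths reveal less, and occasionally slip in a junk frame that the receiver throws away. The bucket sizes, the two-byte length prefix, and the function names are my own assumptions for illustration, not how any particular VPN protocol does it. In practice the padding would sit inside the encrypted tunnel so an observer couldn't tell it apart from real data.

```python
import random
import struct
from typing import Iterable, Iterator, Optional

BUCKETS = (256, 512, 1024, 1500)   # hypothetical fixed frame sizes


def pad(data: bytes) -> bytes:
    """Prefix the real length, then pad up to the next bucket size."""
    framed = struct.pack("!H", len(data)) + data
    for size in BUCKETS:
        if len(framed) <= size:
            return framed + bytes(size - len(framed))
    raise ValueError("message too large for the largest bucket")


def unpad(frame: bytes) -> bytes:
    """Recover the original message; junk frames come back empty."""
    (length,) = struct.unpack("!H", frame[:2])
    return frame[2:2 + length]


def obfuscate(
    messages: Iterable[bytes],
    junk_rate: float = 0.3,
    rng: Optional[random.Random] = None,
) -> Iterator[bytes]:
    """Yield padded frames, randomly inserting zero-length junk frames."""
    rng = rng or random.Random()
    for message in messages:
        if rng.random() < junk_rate:
            yield pad(b"")   # junk: the receiver discards empty messages
        yield pad(message)


frames = list(obfuscate([b"a" * 10, b"b" * 300], junk_rate=0.5))
received = [unpad(f) for f in frames if unpad(f)]
print([len(f) for f in frames], received == [b"a" * 10, b"b" * 300])
```

The overhead mentioned above is visible here: a 10-byte message goes out as a 256-byte frame, and every junk frame is bandwidth spent on nothing.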
Ultimately, these trade-offs are worthwhile if the alternative is being unable to connect to a VPN at all. Despite the challenges posed by evolving DPI techniques, continued innovation by VPN developers helps users in highly censored environments maintain access to secure and private connections.
Sam Dawson is a cybersecurity expert who has over four years of experience reviewing security-related software products. He focuses his writing on VPNs and security, previously writing for ProPrivacy before freelancing for Future PLC's brands, including TechRadar. Between running a penetration testing company and finishing a PhD focusing on speculative execution attacks at the University of Kent, he still somehow finds the time to keep an eye on how technology is impacting current affairs.