Path Traversal Attack and Prevention
A path traversal attack lets an attacker access files and directories they should not be able to reach, such as config files or any other files and directories that contain server data not intended for the public.
Using a path traversal attack (also known as directory traversal), an attacker can access data stored outside the web root folder (typically /var/www/). By manipulating variables that reference files with “dot-dot-slash (../)” sequences and their variations, or by using absolute file paths, it may be possible to access arbitrary files and directories stored on the file system, including application source code, configuration, and critical system files.
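As a small illustration of the variations mentioned above, the Python sketch below shows how a URL-encoded sequence decodes to ../ and how an absolute path can override a base directory when paths are joined naively. It uses only the standard library. The base directory /var/www and the requested names are assumptions chosen for illustration, and no file is actually opened.

```python
import posixpath
from urllib.parse import unquote

BASE_DIR = "/var/www"  # assumed web root, for illustration only

# A URL-encoded variation decodes to the plain dot-dot-slash sequence.
encoded = "%2e%2e%2f%2e%2e%2fetc%2fpasswd"
decoded = unquote(encoded)

# Joining and normalizing the decoded value walks up out of the base directory.
relative_result = posixpath.normpath(posixpath.join(BASE_DIR, decoded))

# posixpath.join discards the base entirely when the second part is absolute.
absolute_result = posixpath.join(BASE_DIR, "/etc/passwd")

print(decoded)
print(relative_result)
print(absolute_result)
```

Both the relative and the absolute variant end up pointing outside the assumed web root, which is why a filter that only looks for the literal string ../ is not enough.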
Let’s assume we have a website running on
Let’s also suppose that the web server is vulnerable to a path traversal attack. This allows an attacker to use special character sequences like ../, which on Unix refers to the parent directory, to traverse up the directory chain and access files outside of /var/www, including config files.
A typical example of a vulnerable application in PHP is:
Using the same ../ technique, an attacker can escape out of the directory containing the PDFs and access anything they want on the system.
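The same flaw is not specific to PHP. The minimal Python sketch below mirrors the pattern described above: a user-supplied file name is appended to a document directory without any validation. The directory /var/www/pdfs, the function name build_pdf_path, and the sample inputs are hypothetical, chosen only for illustration. The function only builds the path and does not open any file.

```python
import posixpath

PDF_DIR = "/var/www/pdfs"  # hypothetical directory holding the PDFs


def build_pdf_path(user_supplied_name: str) -> str:
    """Vulnerable on purpose: trusts the user-supplied name as-is."""
    # No validation: "../" segments in the name are kept and then resolved.
    return posixpath.normpath(posixpath.join(PDF_DIR, user_supplied_name))


# An expected request stays inside the PDF directory.
print(build_pdf_path("report.pdf"))

# A crafted request climbs three levels up and lands on a system file.
print(build_pdf_path("../../../etc/passwd"))
```

If the returned path were passed straight to a file-reading call, the second request would expose a file that was never meant to be served.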
A possible algorithm for preventing directory traversal would be to:
- Give appropriate permissions to directories and files. A PHP file typically runs as the www-data user on Linux, and this user should not be allowed to access system files. This alone, however, does not prevent the user from accessing web-application-specific config files.
- Process URI requests that do not result in a file request, e.g., executing a hook into user code, before continuing below.
- When a URI request for a file/directory is to be made, build a full path to the file/directory if it exists, and normalize all characters (e.g., %20 converted to spaces).
- It is assumed that a ‘Document Root’ fully qualified, normalized, path is known, and this string has a length N. Assume that no files outside this directory can be served.
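- Ensure that the first N characters of the fully qualified path to the requested file are exactly the same as the ‘Document Root’, and that the next character, if any, is a path separator, so that a sibling directory sharing the same prefix is not accepted.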
- Note that appending a hard-coded, predefined file extension to the path does not limit the attack to files with that extension; on older PHP versions, for example, an injected null byte (%00) could cut the suffix off.
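The path-checking steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a complete defense: the document root /var/www and the function name resolve_request_path are hypothetical, the request path is decoded only once, and the check works purely on strings without touching the file system.

```python
import posixpath
from typing import Optional
from urllib.parse import unquote

DOCUMENT_ROOT = "/var/www"  # assumed, already normalized document root


def resolve_request_path(request_path: str) -> Optional[str]:
    """Return the normalized path if it stays under DOCUMENT_ROOT, else None."""
    # Decode percent-escapes (e.g. %20 becomes a space, %2e%2e%2f becomes ../).
    decoded = unquote(request_path)

    # Reject embedded null bytes rather than passing them on to file APIs.
    if "\x00" in decoded:
        return None

    # Treat the request as relative to the root, then collapse "." and "..".
    candidate = posixpath.normpath(
        posixpath.join(DOCUMENT_ROOT, decoded.lstrip("/"))
    )

    # Compare against the root plus a separator so that a sibling directory
    # sharing the same prefix does not pass a plain prefix check.
    if candidate == DOCUMENT_ROOT or candidate.startswith(DOCUMENT_ROOT + "/"):
        return candidate
    return None


print(resolve_request_path("/docs/my%20file.pdf"))
print(resolve_request_path("../../etc/passwd"))
print(resolve_request_path("../www-private/secret.txt"))
```

Only the first request is accepted; the other two normalize to locations outside the assumed root and are rejected. Because posixpath.normpath does not look at the disk, symbolic links inside the document root are not resolved here; on a real file system, os.path.realpath could be applied to both the root and the candidate before the comparison.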
This article is contributed by Akash Sharan.