Developers often leave files behind in a site's root folder, usually through oversight, that attackers can use to compromise the site or steal sensitive information. These can be database dumps, configuration files, or even source code. Bug hunters regularly find such exposures and report them to bug bounty programs (programs that pay for discovered vulnerabilities).
The most dangerous files according to ChatGPT
The root folder corresponds to the part of the site address right after the first slash that follows the domain name. In simple terms, an attack on the site root looks like this: https://example.com/[part_of_the_URL_guessed_by_the_intruder]. Its effectiveness depends directly on the wordlist that hackers (in our case, security researchers) use. The more relevant and complete the list, the higher the chances of finding files the developers left behind.
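To make the URL pattern above concrete, here is a minimal Python sketch of how a wordlist turns into candidate addresses. The function name `candidate_urls` and the sample entries are illustrative assumptions, not part of any published tool. The sketch only builds URLs and sends no requests; any real checks should be run only against sites you are authorized to test.

```python
from urllib.parse import quote


def candidate_urls(base_url, wordlist):
    """Yield one candidate URL per usable wordlist entry (hypothetical helper)."""
    # Make sure the base ends with exactly one slash before appending names.
    base = base_url.rstrip("/") + "/"
    for entry in wordlist:
        # Trim whitespace and any leading slash so names join cleanly.
        name = entry.strip().lstrip("/")
        # Skip blank lines and comment lines, which wordlists often contain.
        if not name or name.startswith("#"):
            continue
        # Percent-encode unsafe characters; "/" is kept so sub-paths survive.
        yield base + quote(name)


if __name__ == "__main__":
    sample = ["config.backup", "/test-odbc.php", "", "# comment line"]
    for url in candidate_urls("https://example.com", sample):
        print(url)
```

The quality of the result depends entirely on the entries passed in, which is why the choice of list matters so much.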
If you ask ChatGPT directly for the names of potentially dangerous files, it will not answer. OpenAI, the developer of the neural network, prohibits its use for malicious purposes, so ChatGPT refuses requests to teach hacking. However, with some creativity, hackers can still extract useful information, in particular a list of the most important and common files found in site roots. The list ChatGPT generated for this request contains 5,000 files, including config.backup (which may hold important site configuration data) and test-odbc.php (a script that tests the connection to the database).
Some of these files already appear in published fuzz.txt lists, which researchers have, as a rule, compiled by hand. This overlap suggests that the neural network is on the right track. For example, the white-hat hacker Bo0oM maintains a similar list on GitHub. ChatGPT-fuzz.txt shares no more than 354 lines with it; the rest are new filenames that bug hunters had not seen before.
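An overlap figure like the one above can be computed with a simple set comparison. The sketch below is one way to do it, assuming both lists are plain text with one path per line; the function names and the sample entries are hypothetical, and this is not necessarily how the original comparison was carried out.

```python
def load_entries(lines):
    """Normalize wordlist lines: trim whitespace and leading slashes, drop blanks."""
    entries = set()
    for line in lines:
        name = line.strip().lstrip("/")
        if name:
            entries.add(name)
    return entries


def compare_wordlists(generated, reference):
    """Return the entries shared by both lists and those only in the generated one."""
    gen = load_entries(generated)
    ref = load_entries(reference)
    return {"shared": sorted(gen & ref), "new": sorted(gen - ref)}


if __name__ == "__main__":
    # Illustrative entries only; real lists would be read from files.
    generated = ["config.backup", "test-odbc.php", ".git/config"]
    reference = ["/.git/config", "robots.txt"]
    report = compare_wordlists(generated, reference)
    print(len(report["shared"]), "shared,", len(report["new"]), "new")
```

Normalizing leading slashes before comparing matters, because published lists differ in whether entries start with a slash.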
What makes this list unique is that ChatGPT draws on a huge body of data from the Internet. A person physically cannot analyze as much information as a neural network.
Developers have to check sites before hackers arrive
It is often said that recognizing a problem is half the solution. Website developers should think in advance about what they store in the root folder, remove unnecessary files now, and better protect the ones they need. This will help prevent problems in the future. For example, what could be dangerous about leaked logs? Consider a real case that we found using ChatGPT-fuzz.txt.
In this case, the webhook_sms_log.txt file "leaked" the personal data of users, including mobile numbers and home addresses. One can only guess how attackers would use this information if they discovered the vulnerability first.
In addition, developers should not use predictable filenames such as test.php or config.txt. Another recommendation that is relevant for both developers and information security specialists is to periodically re-audit sites. If you are responsible for the security of services, then you must tell the developers what the consequences for the company may be if they do not follow simple rules.
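A periodic re-audit can start on the developer's own side, before any external scan. The following is a minimal sketch, assuming you have local access to the deployed web root; the pattern list and the function name `find_risky_files` are illustrative assumptions and should be adapted to your project.

```python
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical patterns for files that rarely belong in a public web root.
RISKY_PATTERNS = ("*.bak", "*.backup", "*.sql", "*_log.txt", "test*.php", "config.txt")


def find_risky_files(web_root, patterns=RISKY_PATTERNS):
    """Return sorted paths (relative to web_root) whose names match a risky pattern."""
    root = Path(web_root)
    hits = []
    for path in root.rglob("*"):
        # Compare the lowercased file name against each shell-style pattern.
        if path.is_file() and any(fnmatch(path.name.lower(), p) for p in patterns):
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)


if __name__ == "__main__":
    # Scans the current directory as a stand-in for a real web root.
    for name in find_risky_files("."):
        print("review before deploying:", name)
```

A check like this could run as part of a deployment pipeline, so that leftover backups and logs are flagged before they become reachable from the Internet.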
The potential of ChatGPT has not yet been fully explored. Information security specialists are only now coming to understand what kind of tool it is and how it can be used in their work. Beyond the names of the most dangerous files, the neural network can help identify popular headers, cookies, site configuration settings, and much more.
It is important that as many bug hunters as possible start relying on lists similar to ChatGPT-fuzz.txt when searching for vulnerabilities. This will increase the security of the services that we use every day, and thereby make our world safer.