Ultimate Reconnaissance RoadMap for Bug Bounty Hunters & Pentesters
In the name of Allah, the Most Gracious, the Most Merciful
This article is aimed at bug bounty hunters and penetration testers. Its content is not new and is already available on the internet, but the way it is delivered here is different.
Who Am I?
My name is Ahmad Halabi. I am the founder of Cybit Sec and currently work as a Senior Cyber Security Specialist in Dubai. I am a former bug bounty hunter listed in the Halls of Fame of 200+ well-known programs, and I still rank among the top 50 hackers of all time on HackerOne.
You can check my biography here: https://ahmadhalabi.net/biography
Brief Intro ::
Recently I spoke at two conferences, @Hack in Riyadh, KSA (Nov 2021) and the RedTeam Security Summit (Dec 2021), where I delivered two talks about Advanced Reconnaissance and Bug Bounty Hunting.
I received a lot of positive feedback after these talks, as well as requests to share their content. So I decided to write an article about the Recon methodology I used when I was doing bug bounty hunting, both to help newcomers understand how hackers look for vulnerabilities and to share my perspective on approaching a target with other hackers.
Ultimate Reconnaissance Roadmap ::
In early 2020 I collected the most widely used recon concepts and built my own strategy around them as a Recon Roadmap, which I then used in both bug bounty hunting and penetration testing work.
Important to know: the mindset behind searching for vulnerabilities is as important as your technical skills. Doing reconnaissance in an advanced way is an art!
What is Reconnaissance?
Reconnaissance, also known as `Information Gathering` or `Footprinting`, is the first step hackers take when approaching a target: they search for weaknesses and then use the vulnerabilities they find to exploit the target system.
Recon is the process of gathering as much information as possible about the target in order to identify ways to intrude into the target system.
Recon Types :
- Active: Involves interacting directly with the target's architecture and infrastructure, for example by sending traffic and requests to its systems or by physically accessing the company's premises.
- Passive: Involves gathering information without interacting directly with the target, for example by using search engines and open-source intelligence (OSINT) to learn about the target system.
Note: Detailed reconnaissance lets hackers narrow their focus area and map out the network, so they know exactly where to concentrate further enumeration and exploitation.
Advanced Recon and Web Application Discovery (ARWAD) ::
I gave the Recon roadmap this name because it proved to be a very effective strategy that let me discover a large number of bugs in a short time.
Below I discuss the concepts and mindset behind this recon roadmap.
ARWAD Roadmap ::
You can download the image from my GitHub repo: https://github.com/ahmad0x1/ARWAD/blob/main/ARWAD_Methodology.jpg
ARWAD Methodology ::
Whenever I come across a domain, I apply the following methodology to collect as much information about it as I can. This widens the attack surface, which in turn raises the chances of finding hidden information that leads to more vulnerabilities.
Note: In this article I focus on the methodology and mindset more than on tools, because once you understand the way of thinking, using the tools is very easy.
I start by collecting base information:
1. WHOIS Information: Useful for checking registration details and information about the domain owners (emails, phone numbers).
2. DNS Information: Very useful for understanding how the domain is set up and for predicting which DNS-related vulnerabilities to look for.
3. Acquisitions: Looking for companies acquired by the target will in turn give you more domains to test, and with them a higher chance of finding vulnerabilities.
Example: Google has acquired a lot of companies, and all companies acquired by Google are in scope. Imagine the number of domains and IP addresses that puts in scope for Google.
Going more advanced, we repeat phases 1 and 2 for all subdomains gathered in the upcoming `Subdomain Enumeration` phase.
Going After Subdomains :
4. Subdomain Enumeration: Covers both phases (active and passive). Using more than one tool for the same purpose is recommended, because each tool may return unique results, and combining them gives a more complete picture.
5. Sort & Filter: Separate resolving domains from non-resolving ones, because the later phases work on the records that resolve.
6. Subdomain Takeover: With the help of phase 2 (DNS Information), we take the resolving subdomains and check them for subdomain takeover.
7. Extracting IP Addresses: Extract all IP addresses from the collected subdomains so they can be used later in the `Open Source Intelligence` and `Port Scanning` phases.
8. Port Scanning & Banner Grabbing: Mass-scan all the extracted IP addresses to look for unusual or interesting protocols and services running on them. Then search for exploits and attacks that match the results.
9. Open Source Intelligence: Engines like Shodan and Censys are very powerful when it comes to searching for vulnerabilities and exploits. Use the collected IP addresses and subdomains in this phase.
10. Gather Live Hosts: Building on the resolved hosts collected earlier, here I focus on hosts that have web-related ports (HTTP/HTTPS) open, which I need for the next reconnaissance phases.
11. Sub-Subdomains: Check for sub-subdomains either with existing tools like `Altdns` or with your own techniques, such as brute-forcing subdomains with your own wordlists.
12. Content Discovery: Brute-force directories, files and endpoints, and fuzz parameters to identify unusual responses that help you detect further vulnerabilities. I always suggest studying your target and creating a customized wordlist for better results.
13. GitHub Search: Another valuable resource for finding secrets and exposed information related to the target program. You can also understand how the program behaves by analyzing its code and studying its structure, which can lead to critical findings.
14. Nuclei: A great tool that I usually use to look for common vulnerabilities and CVEs. It is worth keeping it updated, customizing its templates to fit your needs, and running it against large numbers of subdomains.
15. Wayback Machine: A very powerful resource for extracting cached information and a lot of URLs, which increases your chances of finding hidden endpoints and details.
From Wayback you can:
- Combine it with a tool called `gf` to extract patterns from endpoints and test for vulnerabilities like SSRF, SQL injection and XSS.
- Pass it to Nuclei for an additional scan.
- Extract important file extensions (pdf, db, xlsx, …) that might be cached or forgotten and may contain sensitive information.
16. Extracting JS Files: From JS files you can understand the application, extract endpoints, look for credentials and leaked information, check for misconfigurations and permission checks, and much more. JS files are very informative and well worth analyzing.
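The Wayback post-processing ideas from phase 15 can be sketched in a few lines of Python. This is only an illustration under assumptions: the URL list is assumed to have been collected already with whatever archive tool you prefer, the function names `interesting_files` and `suspicious_params` are hypothetical, and the parameter-name hints are a small made-up sample in the spirit of `gf` patterns, not that tool's real pattern files.

```python
from urllib.parse import urlsplit, parse_qsl

# Extensions worth a manual look; extend this set to fit the target (assumption).
SENSITIVE_EXTENSIONS = {".pdf", ".db", ".xlsx", ".sql", ".bak", ".zip"}

# Illustrative parameter-name hints per bug class (hypothetical sample, not gf's files).
PARAM_HINTS = {
    "ssrf": {"url", "uri", "dest", "redirect", "callback"},
    "sqli": {"id", "order", "sort", "query"},
    "xss": {"q", "search", "message", "name"},
}


def interesting_files(urls):
    """Return archived URLs whose path ends with a sensitive extension."""
    hits = []
    for url in urls:
        path = urlsplit(url).path.lower()
        if any(path.endswith(ext) for ext in SENSITIVE_EXTENSIONS):
            hits.append(url)
    return hits


def suspicious_params(urls):
    """Group URLs by bug class based on their query parameter names."""
    grouped = {bug_class: [] for bug_class in PARAM_HINTS}
    for url in urls:
        pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
        names = {name.lower() for name, _ in pairs}
        for bug_class, hints in PARAM_HINTS.items():
            if names & hints:
                grouped[bug_class].append(url)
    return grouped


if __name__ == "__main__":
    # Made-up URLs standing in for real archive output.
    sample = [
        "https://example.com/files/report.PDF",
        "https://example.com/fetch?url=http://internal/",
        "https://example.com/item?id=7&sort=asc",
        "https://example.com/about",
    ]
    print(interesting_files(sample))
    print(suspicious_params(sample))
```

Matching on parameter names only produces candidates for manual testing. It does not confirm that any of these endpoints is actually vulnerable.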
There are a lot of free and open-source tools that can be used for each phase.
Many hackers know most of these phases, but few know how to use them properly.
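As one illustration, phases 5 and 7 (separating resolving hosts and extracting their IP addresses) can be sketched in Python. This is a minimal sketch under assumptions, not the ARWAD implementation: the names `system_resolver` and `split_by_resolution` are hypothetical, and the resolver is passed in as a parameter so the logic can be tried offline with a fake lookup table.

```python
import socket


def system_resolver(hostname):
    """Resolve a hostname to a sorted list of unique IPs, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except (socket.gaierror, UnicodeError):
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def split_by_resolution(hostnames, resolver=system_resolver):
    """Split hostnames into resolving and non-resolving ones and collect unique IPs."""
    resolved, unresolved, ips = {}, [], set()
    unique_hosts = sorted({h.strip().lower() for h in hostnames if h.strip()})
    for host in unique_hosts:
        addresses = resolver(host)
        if addresses:
            resolved[host] = addresses
            ips.update(addresses)
        else:
            unresolved.append(host)
    return resolved, unresolved, sorted(ips)


if __name__ == "__main__":
    # Fake DNS data (documentation-range addresses) so the example runs offline.
    fake_dns = {
        "www.example.com": ["192.0.2.10"],
        "api.example.com": ["192.0.2.10", "192.0.2.11"],
    }
    resolved, unresolved, ips = split_by_resolution(
        ["www.example.com", "API.example.com", "old.example.com"],
        resolver=lambda host: fake_dns.get(host, []),
    )
    print(resolved)
    print(unresolved)
    print(ips)
```

The de-duplicated IP list is what would feed the port scanning and OSINT phases, while the non-resolving names are kept separately for later review.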
ARWAD Program ::
I automated the whole Recon Roadmap in a program, which I use for my work and my own purposes to save time.
You can see the blueprint below.
You can see an example of gathering Google acquisitions using the `ARWAD` program.
I show this as proof that you can automate the whole Recon process, which saves a lot of time and spares you from doing the entire search manually.
Automating Subdomain Enumeration ::
You can go simpler than writing a full program if you don't have enough development skills. With a simple Bash script you can automate a great deal of the Recon process.
As a proof of concept, see how I automated subdomain enumeration in the image below.
A bash script that automates Subdomain Enumeration by doing the following:
- Use the Sublist3r tool to gather subdomains passively.
- Use the AssetFinder tool to gather subdomains as well.
- Use the Amass tool in passive mode.
- Use the Amass tool in active mode.
- Remove duplicate subdomain records.
- Check for resolving/live subdomains.
- Brute-force subdomains using a customized wordlist.
- Gather sub-subdomains passively.
- Remove duplicate sub-subdomain records.
- Check for subdomain takeover.
- Output the results in multiple files (resolved.txt | unresolved.txt | subsubdomains.txt | subtakeover.txt).
I usually launch this script on a VPS and let it run until it finishes. It takes 1–7 days depending on the scale of the domain. Yes, it takes that long when you are targeting a huge company with a massive number of domains and IP ranges.
This is just an introduction to Advanced Reconnaissance, meant to show you how hackers do their recon and how they search for vulnerabilities. Recon is a very big topic. I did not go deep or discuss technical details here; I only introduced my methodology. There is a lot of Recon-related information that I did not mention and that we can still discuss. We could even write an article about each Recon phase.
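The merge and de-duplication steps from the list above can also be sketched in Python. This is a hedged illustration, not the Bash script shown in the image: it assumes each tool has already produced one hostname per line, and the helper names `normalize`, `merge_results` and `split_depth` are hypothetical.

```python
def normalize(line):
    """Lower-case a hostname and strip whitespace, a trailing dot and a wildcard prefix."""
    host = line.strip().lower().rstrip(".")
    if host.startswith("*."):
        host = host[2:]
    return host


def merge_results(tool_outputs, root_domain):
    """Merge hostname lists from several tools, keeping unique in-scope names only."""
    root = normalize(root_domain)
    merged = set()
    for lines in tool_outputs:
        for line in lines:
            host = normalize(line)
            if host == root or host.endswith("." + root):
                merged.add(host)
    return sorted(merged)


def split_depth(hosts, root_domain):
    """Separate direct subdomains (a.example.com) from sub-subdomains (b.a.example.com)."""
    root = normalize(root_domain)
    subdomains, sub_subdomains = [], []
    for host in hosts:
        if host == root:
            continue
        prefix = host[: -len(root) - 1]  # labels to the left of the root domain
        if "." in prefix:
            sub_subdomains.append(host)
        else:
            subdomains.append(host)
    return subdomains, sub_subdomains


if __name__ == "__main__":
    # Made-up tool outputs; real runs would read these from each tool's output file.
    sublist3r_out = ["www.example.com", "API.example.com"]
    amass_out = ["api.example.com.", "*.dev.example.com", "v1.api.example.com", "example.org"]
    hosts = merge_results([sublist3r_out, amass_out], "example.com")
    print(split_depth(hosts, "example.com"))
```

Normalizing before de-duplicating matters because different tools report the same host with different casing, trailing dots or wildcard prefixes, and the scope check drops unrelated domains that some sources return.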
I hope that you find the information in this article helpful and useful.
I will be happy to discuss this topic further and dig deeper into advanced Reconnaissance if you are interested. Let me know!