1. Search Engine Discovery
1) Operators
a. site: - Limits results to a specific website or domain (e.g., site:example.com -> find all publicly accessible pages on example.com)
b. inurl: - Finds pages with a specific term in the URL (e.g., inurl:login -> search for login pages on any website)
c. filetype: - Searches for files of a particular type (e.g., filetype:pdf -> find downloadable PDF documents)
d. intitle: - Finds pages with a specific term in the title (e.g., intitle:"confidential report")
e. intext: or inbody: - Searches for a term within the body text of pages (e.g., intext:"password reset")
f. cache: - Displays the cached version of a webpage (e.g., cache:example.com)
g. link: - Finds pages that link to a specific webpage (e.g., link:example.com)
h. related: - Finds websites related to a specific webpage (e.g., related:example.com)
i. info: - Provides a summary of information about a page (e.g., info:example.com)
j. define: - Provides definitions of a word or phrase (e.g., define:phishing)
k. numrange: - Searches for numbers within a specific range (e.g., site:example.com numrange:1000-2000 -> find pages on example.com containing numbers between 1000 and 2000)
l. allintext: - Finds pages containing all specified words in the body text (e.g., allintext:admin password reset)
m. allinurl: - Finds pages containing all specified words in the URL (e.g., allinurl:admin panel)
n. allintitle: - Finds pages containing all specified words in the title (e.g., allintitle:confidential report)
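o. AND - Narrows results by requiring all terms to be present (e.g., site:example.com AND inurl:admin)
p. OR - Broadens results by matching pages that contain any of the terms (e.g., inurl:login OR inurl:admin)
q. NOT - Excludes results containing the specified term (e.g., site:example.com NOT inurl:login)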
r. * (wildcard) - Represents any character or word
s. .. (range search) - Finds results within a specified numerical range (e.g., site:ecommerce.com "price" 100..500)
t. " " (quotation marks) - Searches for exact phrases
u. - (minus sign) - Excludes terms from the search results (e.g., site:news.com -inurl:sports -> searches for news articles on news.com excluding sports-related content)
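The operators above can be combined into a single query. The following is a minimal Python sketch of that idea, assuming the query will be pasted into a browser; the helper names `build_query` and `search_url` are hypothetical and only illustrate how terms, exclusions, and URL encoding fit together.

```python
from typing import Iterable
from urllib.parse import quote_plus


def build_query(*terms: str, exclude: Iterable[str] = ()) -> str:
    """Join operator terms with spaces and prefix each excluded term with '-'."""
    parts = list(terms) + [f"-{term}" for term in exclude]
    return " ".join(parts)


def search_url(query: str, base: str = "https://www.google.com/search") -> str:
    """URL-encode the query string so it can be opened in a browser."""
    return f"{base}?q={quote_plus(query)}"


# Combine site:, intitle: with an exact phrase, and an excluded inurl: term.
query = build_query(
    "site:news.com",
    'intitle:"press release"',
    exclude=("inurl:sports",),
)
print(query)
print(search_url(query))
```

Note that there is no space between an operator, its colon, and its value, and none between the minus sign and the term it excludes.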
2) Google Dorking
- Also known as Google Hacking
- A technique that uses search operators to uncover sensitive information, security vulnerabilities, or hidden content on websites through Google Search
(1) Finding Login Pages
- site:example.com inurl:login
- site:example.com (inurl:login OR inurl:admin)
(2) Identifying Exposed Files
- site:example.com filetype:pdf
- site:example.com (filetype:xls OR filetype:docx)
(3) Uncovering Configuration Files
- site:example.com inurl:config.php
- site:example.com (ext:conf OR ext:cnf) -> searches for extensions commonly used for configuration files
(4) Locating Database Backups
- site:example.com inurl:backup
- site:example.com filetype:sql
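The four dork categories above follow the same pattern: a fixed query with the target domain substituted in. As a rough sketch, they can be kept as templates and expanded per target. The names `DORK_TEMPLATES` and `dorks_for` are hypothetical, and the templates are exactly the queries listed above.

```python
# Dork templates taken from the four categories above; {domain} is filled in per target.
DORK_TEMPLATES = {
    "login pages": [
        "site:{domain} inurl:login",
        "site:{domain} (inurl:login OR inurl:admin)",
    ],
    "exposed files": [
        "site:{domain} filetype:pdf",
        "site:{domain} (filetype:xls OR filetype:docx)",
    ],
    "configuration files": [
        "site:{domain} inurl:config.php",
        "site:{domain} (ext:conf OR ext:cnf)",
    ],
    "database backups": [
        "site:{domain} inurl:backup",
        "site:{domain} filetype:sql",
    ],
}


def dorks_for(domain: str) -> dict:
    """Return every template expanded for the given domain, grouped by category."""
    return {
        category: [template.format(domain=domain) for template in templates]
        for category, templates in DORK_TEMPLATES.items()
    }


for category, queries in dorks_for("example.com").items():
    print(category)
    for dork in queries:
        print("  " + dork)
```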
2. Web Archives
1) Wayback Machine
- Lets you revisit the past and explore the digital footprints of websites as they once were.
- By entering a URL and selecting a date, you can view how the website looked at that specific point.
- It does not capture every single webpage online. It prioritizes websites deemed to be of cultural, historical, or research value.
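The "enter a URL and select a date" lookup can also be done programmatically. The sketch below assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and its JSON layout with an `archived_snapshots.closest` entry; the function names are hypothetical. Because not every page is archived, the parser returns `None` when no snapshot is reported.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_url(target: str, timestamp: str) -> str:
    """Build the lookup URL; timestamp is a date such as YYYYMMDD."""
    return f"{AVAILABILITY_API}?{urlencode({'url': target, 'timestamp': timestamp})}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the snapshot URL closest to the requested date, or None if nothing was archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(target: str, timestamp: str, timeout: float = 10.0) -> Optional[str]:
    """Query the endpoint over the network and return the closest snapshot URL, if any."""
    with urlopen(availability_url(target, timestamp), timeout=timeout) as response:
        return closest_snapshot(json.load(response))


print(availability_url("example.com", "20100101"))
```

Calling `lookup("example.com", "20100101")` performs the actual request; the module itself only prints the URL it would fetch.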
3. Automating Recon
1) Reconnaissance Frameworks
(1) FinalRecon: A Python-based reconnaissance tool offering a range of modules for tasks such as SSL certificate checking, Whois information gathering, header analysis, and crawling. Its modular structure enables easy customisation for specific needs.
(2) Recon-ng: A modular framework written in Python. Its modules can perform DNS enumeration, subdomain discovery, port scanning, web crawling, and even exploit known vulnerabilities.
(3) theHarvester: A command-line tool written in Python, designed for gathering email addresses, subdomains, hosts, employee names, open ports, and banners from public sources such as search engines, PGP key servers, and the Shodan database.
(4) SpiderFoot: An open-source intelligence automation tool that integrates with various data sources to collect information about a target, including IP addresses, domain names, email addresses, and social media profiles. It can perform DNS lookups, web crawling, port scanning, and more.
(5) OSINT Framework: A collection of various tools and resources for open-source intelligence gathering. It covers a wide range of information sources, including social media, search engines, public records, and more.
2) FinalRecon
- Offers a wealth of recon information:
(1) Header information
(2) Whois lookup
(3) SSL certificate information
(4) Crawler
(5) DNS enumeration
(6) Subdomain enumeration
(7) Directory enumeration
(8) Wayback Machine
yeon0815@htb[/htb]$ git clone https://github.com/thewhiteh4t/FinalRecon.git
yeon0815@htb[/htb]$ cd FinalRecon
yeon0815@htb[/htb]$ pip3 install -r requirements.txt
yeon0815@htb[/htb]$ chmod +x ./finalrecon.py
yeon0815@htb[/htb]$ ./finalrecon.py --help
usage: finalrecon.py [-h] [--url URL] [--headers] [--sslinfo] [--whois]
                     [--crawl] [--dns] [--sub] [--dir] [--wayback] [--ps]
                     [--full] [-nb] [-dt DT] [-pt PT] [-T T] [-w W] [-r] [-s]
                     [-sp SP] [-d D] [-e E] [-o O] [-cd CD] [-k K]

FinalRecon - All in One Web Recon | v1.1.6
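The usage line above shows that each module is switched on with its own flag and the target is passed with --url. As a small sketch, a Python helper can assemble such a command line from a list of module names. The helper `finalrecon_command` is hypothetical, it uses only the long flags visible in the usage output, and the target URL is a placeholder.

```python
import shlex

# Long module flags copied from the usage output above.
MODULE_FLAGS = {
    "headers", "sslinfo", "whois", "crawl", "dns",
    "sub", "dir", "wayback", "ps", "full",
}


def finalrecon_command(url: str, modules: list, script: str = "./finalrecon.py") -> list:
    """Return an argv list that enables the chosen modules for one target URL."""
    unknown = [name for name in modules if name not in MODULE_FLAGS]
    if unknown:
        raise ValueError(f"unknown module flag(s): {', '.join(unknown)}")
    return [script, *(f"--{name}" for name in modules), "--url", url]


# Placeholder target; enable header and Whois gathering only.
command = finalrecon_command("http://example.com", ["headers", "whois"])
print(shlex.join(command))
```

The returned list can be handed to `subprocess.run(command, check=True)` from inside the cloned FinalRecon directory; rejecting unknown names keeps typos from silently reaching the tool.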