
Photon Scanner – Web Scraping OSINT Tool

Last Updated : 17 Jun, 2021

Gathering information on online targets can be very time-consuming, especially when you need only specific information about the target rather than everything available. If you need specific pieces of information about a target with many subdomains, you need a tool that can do the heavy lifting and sift through URLs on your behalf to retrieve whatever is of value. Photon, an OSINT scanner, fits this profile well.

Key Features:

1) Data Extraction: Photon can extract the following data while crawling:

  • URLs (in-scope and out-of-scope)
  • URLs with parameters (e.g. example.com/gallery.php?id=3)
  • Intel (emails, social media accounts, etc.)
  • Files (png, jpeg, pdf, etc.)
  • Secret keys (API keys and hashes)
  • Subdomains and DNS-related data
  • Strings matching a custom regex pattern

Note: The extracted information is saved in an organized manner and it can be exported as JSON.
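To make the extraction categories above more concrete, here is a minimal Python sketch of how a crawler might sort the links and emails found on one page. The regular expressions, the bucket names, and the extract function are assumptions made for illustration; they are not Photon's actual patterns or code.

```python
import re
from urllib.parse import urlparse

# Illustrative patterns only; Photon's own regexes may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HREF_RE = re.compile(r'href=["\']([^"\']+)["\']', re.IGNORECASE)


def extract(html, target_host):
    """Sort the links and emails found in one page into named buckets."""
    found = {"internal": set(), "external": set(), "fuzzable": set(), "emails": set()}
    found["emails"].update(EMAIL_RE.findall(html))
    for link in HREF_RE.findall(html):
        parsed = urlparse(link)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript: and relative links
        host = parsed.hostname or ""
        in_scope = host == target_host or host.endswith("." + target_host)
        found["internal" if in_scope else "external"].add(link)
        if in_scope and parsed.query:
            found["fuzzable"].add(link)  # in-scope URL that carries parameters
    return found


page = (
    '<a href="https://example.com/gallery.php?id=3">gallery</a> '
    '<a href="https://other.org/">elsewhere</a> '
    '<a href="mailto:admin@example.com">mail us</a>'
)
result = extract(page, "example.com")
print(sorted(result["fuzzable"]))
print(sorted(result["emails"]))
```

A real crawler would also resolve relative links and follow the in-scope URLs it discovers; this sketch only classifies what a single page contains.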

2) Flexible: Control the timeout and delay, add seeds, exclude URLs matching a regex pattern, and more. This wide range of options lets you crawl the web exactly the way you want, which makes Photon a powerful tool.
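As a rough illustration of what excluding URLs by regex means, the sketch below drops every queued URL that matches an exclusion pattern. The filter_urls helper, the sample URLs, and the pattern are hypothetical and only show the idea; Photon's internal filtering may work differently.

```python
import re


def filter_urls(urls, exclude_pattern):
    """Drop every URL that matches the exclusion regex, keeping the rest in order."""
    exclude = re.compile(exclude_pattern)
    return [url for url in urls if not exclude.search(url)]


# Hypothetical crawl queue for an example target.
queue = [
    "https://example.com/",
    "https://example.com/logout",
    "https://example.com/blog/post-1",
    "https://example.com/static/logo.png",
]

# Skip logout links and PNG images.
kept = filter_urls(queue, r"/logout|\.png$")
print(kept)
```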

3) Plugins:

  • wayback
  • dnsdumpster
  • Exporter

Requirements:

python3: To check whether your system has Python installed, open a terminal window and type python3. If the interpreter prompt appears, as in the screenshot below, Python 3 is installed; otherwise, on Debian-based systems, you can install it with sudo apt install python3.

Fig1: Python3 is installed in our system.
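If you prefer to check the prerequisites from a script, a small sketch like the following reports which command-line tools are missing from your PATH. The missing_tools helper is a hypothetical name, and checking for git is an assumption based on the git clone step used later in this article.

```python
import shutil
import sys


def missing_tools(tools=("python3", "git")):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print("Running under Python", sys.version.split()[0])
missing = missing_tools()
if missing:
    print("Please install:", ", ".join(missing))
else:
    print("python3 and git are available")
```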

Using Photon:

Step 1: Once Python 3 is installed, we also need to install some dependencies. We can do this by typing the following command in a terminal window:

pip install tld requests

After completion, move on to the next step.

Step 2: To install Photon, type in the following commands in your terminal window.

 git clone https://github.com/s0md3v/Photon.git

 cd Photon

Step 3: Now, we can run python3 photon.py -h to see the list of options that Photon provides.

python3 photon.py -h

Fig2: Optional arguments.

The most basic scan one can run is python3 photon.py -u target.com.

Next, let’s use one of Photon’s most useful and interesting features: the ability to generate a visual DNS map of everything connected to a domain. To do this, we simply run a scan with the --dns flag. Suppose we want to generate a map of geeksforgeeks.com; for this, we run the following command:

python3 photon.py -u https://www.geeksforgeeks.com/ --dns

Fig3: www.geeksforgeeks.com used as target.

Here, we can see that the scan took only 1 second in total. If you supply an inaccurate or unreachable URL, Photon will take longer. The summary also shows how many requests per second were made, which in this case is 2.

Fig4: geeksforgeeks.com files generated by photon.

Fig5: DNS map.

The DNS map shows that it has a very simple domain structure consisting of 0 subdomains.

Let’s zoom in and look at the MX record (MX stands for “mail exchanger”), which is responsible for email services. The record reads 10 mx76.m2bp.com: 10 is the preference value and mx76.m2bp.com is the mail server, which the map shows on ports 443 and 80.

Fig6: Mail exchanger of geeksforgeeks.com.
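In MX record notation, the leading number is the preference value and the name after it is the mail server; when a domain has several MX records, the lowest preference is tried first. The sketch below splits a record such as the one in the map into those two fields. The parse_mx helper and the second record in the list are made up for illustration.

```python
def parse_mx(record):
    """Split an MX value such as '10 mx76.m2bp.com' into (preference, host)."""
    preference, host = record.split()
    return int(preference), host.rstrip(".")


# The first record is the one shown in the DNS map; the second is hypothetical.
records = ["10 mx76.m2bp.com", "5 backup-mx.example.net."]

# Mail servers sorted by preference, lowest (tried first) at the top.
ordered = sorted(parse_mx(record) for record in records)
for preference, host in ordered:
    print(preference, host)
```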

All of this can be a goldmine for attackers looking for the most vulnerable system connected to the target.

Step 4: Now, let’s grab some email addresses and keys from a website, say geeksforgeeks.org.

We’ll add a few more flags to increase the depth and speed of the search. In a terminal, type the following command:

python3 photon.py -u https://www.geeksforgeeks.org/ --keys -t 10 -l 3

Fig7: Email of geeksforgeeks.org

We can see that their email is feedback@geeksforgeeks.org. In the command, -l 3 tells Photon to crawl three levels of URLs deep, whereas -t 10 tells it to open ten threads for the crawl.
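If you run such scans often, you may want to assemble the command from a script. The following sketch builds the argument list using only the flags shown in this article (-u, -l, -t, --keys, --dns). The build_photon_command helper is a hypothetical convenience wrapper, not part of Photon.

```python
import shlex


def build_photon_command(url, level=None, threads=None, keys=False, dns=False):
    """Assemble the argument list for a Photon scan from the flags used in this article."""
    command = ["python3", "photon.py", "-u", url]
    if level is not None:
        command += ["-l", str(level)]      # crawl depth
    if threads is not None:
        command += ["-t", str(threads)]    # number of threads
    if keys:
        command.append("--keys")           # look for secret keys
    if dns:
        command.append("--dns")            # generate the DNS map
    return command


command = build_photon_command(
    "https://www.geeksforgeeks.org/", level=3, threads=10, keys=True
)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run from inside the cloned Photon directory. Building a list rather than a single string avoids shell quoting problems with unusual URLs.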

We will get some email addresses. Because we cast a fairly wide net with this search, there may be many unrelated emails on our list: crawling three levels of URLs deep likely reached some unrelated websites.
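One simple way to cut down the noise is to keep only addresses that belong to the target domain or one of its subdomains. The sketch below assumes the scraped addresses are already loaded into a Python list; the emails_for_domain helper and every sample address except feedback@geeksforgeeks.org are invented for illustration.

```python
def emails_for_domain(emails, domain):
    """Keep addresses whose host part is `domain` or one of its subdomains."""
    domain = domain.lower()
    matches = set()
    for email in emails:
        host = email.rsplit("@", 1)[-1].lower()
        if host == domain or host.endswith("." + domain):
            matches.add(email)
    return sorted(matches)


# Hypothetical scrape result; only the first address appears in the article.
scraped = [
    "feedback@geeksforgeeks.org",
    "someone@example.com",
    "careers@mail.geeksforgeeks.org",
    "feedback@geeksforgeeks.org",
]
relevant = emails_for_domain(scraped, "geeksforgeeks.org")
print(relevant)
```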

Photon Makes Scanning Through URLs Lightning-Fast:

Crawling through hundreds of URLs for information by hand is very time-consuming. Photon makes it easy to crawl large numbers of subdomains or several targets, allowing you to scale your research during the recon phase. With built-in options for parsing and searching for specific kinds of data, such as email addresses and API keys, Photon can catch even small mistakes by a target that reveal a lot of valuable information.

