Using search engines to do your job for you
Search engines can produce an absolute overload of information if not used efficiently. Not only can you find information about the financials of your targets, but also information about key employees, usernames and passwords, confidential documents such as network diagrams, information indicating what types of software or hardware they have in place, and even whether systems are in a default state. This information can be devastating in the wrong hands. As a penetration tester your focus should be to bring this type of information forth and show your clients how it can be used to gain access to their most critical assets (and hopefully, you will tell them how to fix the problem as well!).
Tip
There are search engines that cache information for quick access, and there are search engines that will archive sites and documents for years on end. There are even search engines that focus strictly on networking equipment such as wireless access points or publicly facing routers, switches, servers, and more.
SHODAN
We will continue our footprinting reconnaissance efforts with Shodan. This search engine specializes in indexing the information found in banners served by devices attached to the Internet. The search engine primarily indexes findings from port 80, but also indexes some Telnet, SSH, and FTP banners. SHODAN is a web application and can be accessed by going to http://www.shodanhq.com.
With Shodan you can find information on devices connected to the Internet. In addition to allowing you to search by IP address or hostname it also allows you to search by geographical location. Exporting the search results into XML is a premium feature which would require you to purchase credits. There is an example export available if you want to build a transform for MagicTree or some other data centralization tool before you decide if you want to spend money on the export.
There are several free filters that make narrowing the searches down much simpler. Most filters use the same format: searchterm filter:{filterterm}; an example would be a search for IIS 6.0 os:"Windows 2000". These filters can also be used in conjunction with each other in order to pull some very interesting results.
Here is a listing of several important filters:
- net: Possibly one of the most useful filters for a penetration tester. You can search your IP ranges using IP/CIDR notation (for example, 127.1.1.0/24) to see if all of your devices are configured as expected, or if there are indicators that a vulnerable server or network device configuration is externally facing and ready to be compromised during testing.
- city: This will limit the search to the city listed.
- country: Restricts the search to devices in the country of choice. This is also very important for pentesting, as there may be times when a client provides you with IP ranges (which you validated, right?), and then places certain assets out of scope due to location. A client may choose not to test against systems located in Singapore, for instance.
- port: Will restrict the search to the port indicated. Remember that SHODAN does not scan and index banners for all ports, only for 80, 21, 22, and 23.
- before: Search for systems scanned before a specified date.
- after: Search for systems scanned after this date.
- os: Which operating systems do you want to include or exclude in your search?
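The searchterm filter:{filterterm} format lends itself to a small helper when you need to build many queries for a report. The following is a minimal sketch in Python that only assembles the search string; the function name build_query is our own invention and nothing here talks to Shodan itself.

```python
def build_query(term, **filters):
    """Compose a search string of the form: term filter:value filter:value."""
    parts = [term] if term else []
    for name, value in filters.items():
        value = str(value)
        # Values containing spaces are quoted, as in os:"Windows 2000".
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


# The example from the text, then two filters combined.
print(build_query("IIS 6.0", os="Windows 2000"))
print(build_query("", net="127.1.1.0/24", port=80))
```

The resulting strings can be pasted into the search box as they are. Filters are emitted in the order they are passed to the function.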
In order to perform effective searching in Shodan you must have some understanding of the types of banners that are indexed and what sort of information they typically contain.
FTP, Telnet, and SSH banners will vary, but each will provide useful versioning information.
Banners can be collected by using nc example.com 80 and then typing HEAD / HTTP/1.0 followed by a blank line, which results in the typical banner format you will see in your SHODAN results. As the HTTP banners are often the most difficult to understand, we will walk through some of the commonly found sections:
root@bt:~# nc example.com 80
Trying 192.168.1.1...
Connected to example.com.
Escape character is '^]'.
HEAD / HTTP/1.0
HTTP/1.1 200 OK
Content-Length: 9908
Content-Type: text/html
Last-Modified: Tue, 11 Oct 2011 02:35:17 GMT
Accept-Ranges: bytes
ETag: "6e879e69be87cc1:0"
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Sun, 16 Oct 2011 02:08:55 GMT
Connection: close
Connection closed by foreign host.
- HTTP/1.1 200 OK: The status line indicates the outcome of your request. In this case the HEAD / HTTP/1.0 request was accepted and processed successfully, resulting in a status code of 200 OK.
- Content-Length: Indicates the length of the content as a decimal number of octets.
- Content-Type: Lists the type of content being sent. This could be image/gif, text/html, or other types.
- Accept-Ranges: Indicates whether the server will accept byte-range requests. Setting this to none lets the client know that range requests could be denied.
- ETag: Provides the client with the current entity tag value.
- Server: Provides the version and type of software being used to service the request. This is one of the most important banner results for a penetration tester. Clients should be advised to hide this information. You will use this information to establish what attack types may be usable on the machine.
- X-Powered-By: This is not a standard header, but it can provide useful information to an attacker. It can also be changed or disabled completely.
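If you collect many banners, the header fields described above can be pulled apart programmatically. The following Python sketch is illustrative only: grab_banner and parse_banner are names we made up, grab_banner simply repeats the HEAD / HTTP/1.0 request shown earlier over a plain socket, and the sample text is the banner from the transcript above.

```python
import socket


def grab_banner(host, port=80, timeout=5.0):
    """Send HEAD / HTTP/1.0 and return the raw response as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


def parse_banner(raw):
    """Split a response into its status line and a dict of header fields."""
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if not lines:
        return "", {}
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as dates contain more.
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return lines[0], headers


SAMPLE = """HTTP/1.1 200 OK
Content-Length: 9908
Content-Type: text/html
Last-Modified: Tue, 11 Oct 2011 02:35:17 GMT
Accept-Ranges: bytes
ETag: "6e879e69be87cc1:0"
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Sun, 16 Oct 2011 02:08:55 GMT
Connection: close
"""

status, headers = parse_banner(SAMPLE)
print(status)
for field in ("Server", "X-Powered-By"):
    print(field, "=", headers.get(field))
```

Only call grab_banner against hosts that are within your rules of engagement; parse_banner works on any saved banner text and needs no network access.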
Common status codes include:
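Python's standard library knows the registered HTTP status codes and their reason phrases, which makes for a quick reference. The sketch below prints a handful of them; the selection of codes is our own illustrative choice and describe is a hypothetical helper name.

```python
from http import HTTPStatus

# An illustrative selection; any registered code can be looked up this way.
SELECTED = [200, 301, 302, 401, 403, 404, 500, 503]


def describe(code):
    """Return a code together with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


for code in SELECTED:
    print(describe(code))
```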
Just as with most search engines the tool is extremely user friendly. To perform a basic search, simply type the search string into the input box at the top of the screen and you will be presented with a listing of results. You can search using any of the filters we have previously discussed, or you can try your hand at looking for specific banner fields.
Finding people (and their documents) on the web
In this day and age, everything is becoming interconnected. People are using their personal devices for work, sending out corporate e-mails using personal accounts on publicly owned mail servers, and watching lots of videos. One trend that has occurred over the years is that people have become so comfortable with the Internet that they are willing to share their information with unknown individuals and websites around the world. We will now discuss some of the methods you can use to verify that your clients are not unintentionally or intentionally leaking actionable or confidential data onto the public Internet.
So many books have been written on Google hacking that covering the details and tricks involved would quickly divert the focus of this book.
Tip
If you are not familiar with Google hacking, perform a search for Johnny Long and visit his website at http://www.hackersforcharity.com, and check out The Google Hacking Database (GHDB), which was the original Google Dorks repository.
Exploit-DB at exploit-db.com has taken over and updated Mr. Long's Google Dorks database. This is now the official GHDB site. You should use these tools in tandem with good filters to ensure that you get only the data you need. Here are some examples of how this can be done.
Go to http://exploit-db.com/google-dorks and choose a query. Here is a random entry:
inurl:ftp "password" filetype:xls
Enter it into Google.com with the following modifications. Add the site: option followed by a domain name that is part of your rules of engagement:
site:example.com inurl:ftp "password" filetype:xls
In the case of this example, if there are any results found, you have located an MS Excel file that contains some form of "password". Keep in mind that results will vary, and that the best Google search queries are usually focused on determining the versions of installed software and seeking out known vulnerable installations that will later be targeted if allowed by the rules of engagement.
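When you work through many GHDB entries, prefixing each one with the site: option by hand becomes tedious. The following Python sketch shows one way to scope a query to a domain and turn it into a search URL; scope_dork and google_url are hypothetical helper names, and the URL is simply the address a browser shows when a query is submitted to Google.

```python
from urllib.parse import urlencode


def scope_dork(dork, domain):
    """Restrict a search query to a single in-scope domain."""
    return f"site:{domain} {dork}"


def google_url(query):
    """Build a URL that opens the query in a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = scope_dork('inurl:ftp "password" filetype:xls', "example.com")
print(query)
print(google_url(query))
```

Opening the URL in a browser rather than scripting the requests keeps you within normal use of the search engine.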
You should also be performing focused searches that locate all major document types such as .pdf, .doc, .txt, .xls, and more. However, there are some additional tools that will help us with this.
Tip
Warning: Do not open random files on your primary testing machine. You should have a separate machine that is not connected to your network or the Internet that can be used to open unknown (that is, potentially harmful) files and media. One of the easiest methods of gaining access to a machine is to send a user a file that uses exploits to open the system up to an attacker. Opening unknown files in an uncontrolled environment would be reckless. Don't be that user.
Metagoofil, a powerful metadata gathering tool created by Christian Martorella (http://www.edge-security.com), can be used to automate search engine document retrieval and analysis. It also has the capability to provide MAC addresses, username listings, and more.
BackTrack has the Metagoofil Blackhat Arsenal Edition installed by default. Open up a terminal and type the following:
# cd /pentest/enumeration/google/metagoofil
Metagoofil is a Python script and can be launched by typing:
# ./metagoofil.py
Which results in the following output:
*************************************
* Metagoofil Ver 2.1 - *
* Christian Martorella *
* Edge-Security.com *
* cmartorella_at_edge-security.com *
* Blackhat Arsenal Edition *
*************************************
Metagoofil 2.1:
Usage: metagoofil options
-d: domain to search
-t: filetype to download (pdf,doc,xls,ppt,odp,ods,docx,xlsx,pptx)
-l: limit of results to search (default 200)
-h: work with documents in directory (use "yes" for local analysis)
-n: limit of files to download
-o: working directory
-f: output file
Examples:
metagoofil.py -d microsoft.com -t doc,pdf -l 200 -n 50 -o microsoftfiles -f results.html
metagoofil.py -h yes -o microsoftfiles -f results.html (local dir analysis)
Let's give metagoofil.py a try on the example.com domain:
# python metagoofil.py -d example.com -t doc,pdf -l 200 -n 50 -o examplefiles -f results.html
As a penetration tester you would want to find some documents that provide you all sorts of information about your client when running this tool. We do not currently have any such documents on the example.com domain so the output is as follows:
*************************************
* Metagoofil Ver 2.1 - *
* Christian Martorella *
* Edge-Security.com *
* cmartorella_at_edge-security.com *
* Blackhat Arsenal Edition *
*************************************
['doc']
[-] Starting online search...
[-] Searching for doc files, with a limit of 200
Searching 100 results...
Searching 200 results...
Results: 0 files found
Starting to download 50 of them:
----------------------------------------
tuple index out of range
Error creating the file
[+] List of users found:
--------------------------
[+] List of software found:
-----------------------------
[+] List of paths and servers found:
---------------------------------------
[+] List of e-mails found:
As indicated in the preceding output, if this site had any information that was searchable via Google, it would have provided a nice HTML report of Usernames, E-mail Addresses, Software, Servers, and Paths. All of this is accomplished with one simple command sequence. You can change the variables to look for any documentation type that Google can find based on the filetype: option.
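If you run Metagoofil against several in-scope domains, it can help to generate the command line rather than retype it. The following Python sketch builds the same argument list used above from the options shown in the usage text; metagoofil_args and run_metagoofil are names we made up, and the working directory is assumed to be the BackTrack path from earlier in this section.

```python
import subprocess

# Assumed install location, taken from the cd command shown earlier.
METAGOOFIL_DIR = "/pentest/enumeration/google/metagoofil"


def metagoofil_args(domain, filetypes, limit=200, downloads=50,
                    workdir="examplefiles", report="results.html"):
    """Build the argument list for a Metagoofil run."""
    return [
        "python", "metagoofil.py",
        "-d", domain,
        "-t", ",".join(filetypes),
        "-l", str(limit),
        "-n", str(downloads),
        "-o", workdir,
        "-f", report,
    ]


def run_metagoofil(args):
    """Run Metagoofil from its own directory and return the exit code."""
    return subprocess.run(args, cwd=METAGOOFIL_DIR).returncode


args = metagoofil_args("example.com", ["doc", "pdf"])
print(" ".join(args))
```

Calling run_metagoofil(args) would start the actual search, so it is left out of the example; only do so for domains covered by your rules of engagement.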
Searching the Internet for clues
By now you should have some usernames, and possibly even some phone numbers and job titles. This information will come in handy if you are planning on performing a social engineering test.
Tip
Search engines such as Google can be used to search for information that corporate employees are dropping on the Internet as easily as you could search for a pie recipe. Be sure to verify that your client wants you to do research on employees before you start, not after. There are many laws that protect the privacy of an employee and only a lawyer can let you know what is and what is not acceptable.
One practice that seems to be prominent in penetration testing is to search for forum and group postings made by employees that may include information relating to work assets. Most of the information will not be shared with the world in a malicious manner, but rather innocently. This does not change the fact that attackers have access to this information and could possibly use it against a targeted company. Look for things such as an administrator of the company asking for help on configuring a specific firewall type, or other network devices. A security professional who posts questions on a public forum may be unintentionally providing clues as to which standards their company complies with. These are the types of information that give both you, the penetration tester, and an advanced attacker the knowledge necessary to penetrate an otherwise secured environment.
Here are some tools that would assist you in finding more information:
Metadata collection
In this chapter, we have already touched upon metadata when discussing Metagoofil. Metadata can provide very useful information to a penetration tester. Many users are not even aware that this information is being attached to their files. A good example of this would be the Exif data associated with different image formats. You can find out what type of camera was used, when the photo was taken, where it was taken if GPS data was available at the time (phone cameras…), and much more. Pictures are not the only files that have this type of extensive data available. The same goes for PDF documents, and more. Foca is an excellent program with an intuitive user interface, and its usage is highly advised, but it is a Windows program and is difficult to install on BackTrack (although not impossible by any means!). Thus we will review other options that come preinstalled on our penetration testing toolkit of choice, BackTrack.
exiftool comes preinstalled on BackTrack 5 and can be used to list all of the Exif data associated with many file types. This tool is extremely powerful and allows you to export your results into many different formats, write to file metadata, and more.
We will use a picture named FotoStation.jpg that is included at /pentest/misc/exiftool/t/images for our first usage example.
To start exiftool you can open up a terminal session and type:
# cd /pentest/misc/exiftool
If you run exiftool without any options you will be presented with the tool's help section. It is quite extensive, so be prepared for a lot of reading. Here we initiate a simple check against FotoStation.jpg:
# ./exiftool t/images/FotoStation.jpg
This results in the following output:
ExifTool Version Number : 8.56
File Name : FotoStation.jpg
Directory : t/images
File Size : 4.2 kB
File Modification Date/Time : 2011:04:30 05:32:11-04:00
File Permissions : rw-r--r--
File Type : JPEG
MIME Type : image/jpeg
Image Width : 8
Image Height : 8
Encoding Process : Baseline DCT, Huffman coding
Bits Per Sample : 8
Color Components : 3
Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)
Original Image Width : 1536
Original Image Height : 1024
Color Planes : 3
XY Resolution : 38.626
Rotation : 90
Crop Left : 18.422%
Crop Top : 24.458%
Crop Right : 83.035%
Crop Bottom : 77.817%
Crop Rotation : 0
Application Record Version : 2
Edit Status : Edit Status
Urgency : 1 (most urgent)
Category : Cat
Caption-Abstract : Caption *** Local Caption *** Local Caption
Special Instructions : Special Instructions
Object Cycle : Unknown (Afternoon)
Original Transmission Reference : OTR
Object Preview File Format : Unknown (Custom Field 01)
Object Preview File Version : Custom Field 02
Object Preview Data : (Binary data 15 bytes, use -b option to extract)
Document Notes : Document Notes
Image Size : 8x8
We can see that this provides a tremendous amount of data, but nothing that could really be used for your penetration testing. Now let's try a different file format:
# exiftool t/images/FlashPix.ppt
This provides us the following:
ExifTool Version Number : 7.89
File Name : FlashPix.ppt
Directory : ./t/images
File Size : 9.5 kB
File Modification Date/Time : 2011:04:30 05:32:11-04:00
File Type : PPT
MIME Type : application/vnd.ms-powerpoint
Title : title
Subject : subject
Author : author
Keywords : keywords
Comments : comments
Last Saved By : user name
Revision Number : 1
Software : Microsoft PowerPoint
Total Edit Time : 4.4 minutes
Create Date : 2007:02:09 16:23:23
Modify Date : 2007:02:09 16:27:49
Word Count : 4
Category : category
Presentation Target : On-screen Show
Manager : manager
Company : company
Bytes : 4610
Paragraphs : 4
Slides : 1
Notes : 0
Hidden Slides : 0
MM Clips : 0
App Version : 10 (0972)
Scale Crop : 0
Links Up To Date : 0
Shared Doc : 0
Hyperlinks Changed : 0
Title Of Parts : Times, Blank Presentation, Title
Heading Pairs : Fonts Used, 1, Design Template, 1, Slide Titles, 1
Code Page : 10000
Hyperlink Base : hyperlink base
Hyperlinks : http://owl.phy.queensu.ca/, http://www.microsoft.com/mac/#TEST, mailto:phil?subject=subject
Custom Text : customtext
Custom Number : 42
Custom Date : 2007:01:09 05:00:00
Custom Boolean : 1
Current User : user name
This is the metadata that you are looking for when testing. In this particular example, the information has been scrubbed for learning purposes but some fields of interest should include:
- Title
- Subject
- Author
- Comments
- Software
- Company
- Manager
- Hyperlinks
- Current User
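Picking these fields out of the output by eye does not scale once you have dozens of documents. The following Python sketch parses the plain "name : value" text that exiftool prints and keeps only the fields listed above; the function names are our own, and the sample is an excerpt of the FlashPix.ppt output shown earlier.

```python
FIELDS_OF_INTEREST = (
    "Title", "Subject", "Author", "Comments", "Software",
    "Company", "Manager", "Hyperlinks", "Current User",
)


def parse_exiftool_output(text):
    """Turn 'name : value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; dates and URLs contain more.
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def interesting(fields, wanted=FIELDS_OF_INTEREST):
    """Keep only the fields worth recording for the engagement."""
    return {name: fields[name] for name in wanted if name in fields}


SAMPLE = """\
File Name : FlashPix.ppt
Author : author
Software : Microsoft PowerPoint
Create Date : 2007:02:09 16:23:23
Company : company
Hyperlinks : http://owl.phy.queensu.ca/, http://www.microsoft.com/mac/#TEST, mailto:phil?subject=subject
Current User : user name
"""

parsed = parse_exiftool_output(SAMPLE)
for name, value in interesting(parsed).items():
    print(f"{name}: {value}")
```

The same parser can be fed the saved output of any exiftool run, which makes it easy to load the results into your data centralization tool.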
All of this data starts to make a pretty picture when it is all combined in your data collection and centralization tool. You can use exiftool to read or write metadata in Flash, PPT, and many more file types. You can obtain a complete listing of supported file types from http://www.sno.phy.queensu.ca/~phil/exiftool/#supported.