SANS Penetration Testing

The Secrets in URL Shortening Services


 

Authors: Chris Dale (@ChrisADale) and Vegar Linge Haaland (@vegarlh) wrote this post to raise awareness of the dangers of using URL shortening services. Both work at Netsecurity, a company that serves customers internationally with services in networking, security operations, incident response and penetration testing.

 

URL shortening services are in common use today. They let you create convenient short URLs that serve as redirects to otherwise long and complex URLs. A shortened URL can look like this: https://bit.ly/2vmdGv8 . Once clicked, you are immediately redirected to whichever URL bit.ly has stored behind the magic value 2vmdGv8. A team from archiveteam.org, called URLTeam, has taken on the task of discovering what secrets these magic values hold.
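If you are curious where a single shortened URL points, you can ask for the redirect without following it. A minimal sketch using curl and the example URL above:

curl -sI https://bit.ly/2vmdGv8 | grep -i '^location:'

The Location response header holds the destination URL; URLTeam effectively collects this at massive scale, across the whole keyspace of magic values.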

 

URLTeam's mission is to take these shortening services, e.g. https://bit.ly and https://tinyurl.com, and brute-force the possible values their magic values can hold. The team currently works across a large number of shortening services, all of which can be found here: https://www.archiveteam.org/index.php?title=URLTeam#URL_shorteners; at the time of writing, about 130 different services are supported. Their efforts effectively reveal the destination redirect behind each magic value, and we can very quickly find juicy information!

 

Among the information one can find are URLs pointing to otherwise undiscoverable file shares, links to testing and staging systems, and even some "attack-in-progress" URLs where we can see attacks against target systems. For penetration testers, searching through the brute-forced results can be a helpful addition to the reconnaissance process, as it may reveal endpoints and systems that would otherwise have gone undiscovered.

This blog post describes how to set yourself up for easy searching through this amazing set of data. First, some analysis of the data available, and then some quick and dirty setup descriptions to make this data-set available for yourself.


 

Revealing secrets

Let's take a look at what type of sensitive, funny and otherwise interesting information we can find in URL shortener data. Below are some examples of the things we found by just casually searching, using Google Hacking Database (https://www.exploit-db.com/google-hacking-database/) style searches. We have tried to redact the domain names in question in order to protect the vulnerable companies.

Hack in-progress

First out is the "hack in-progress" type of URL. We are guessing that black-hats, and perhaps others too, use these to share their work in progress with their peers. Some of the identified URLs seem to be simple proofs of concept, while others are designed for data exfiltration.

 

Below we see attacks trying to exploit Local File Inclusion (LFI) by including the Linux file /proc/self/environ. This file is commonly abused to try to get a shell once an LFI vulnerability has been found.

Search term: {"wildcard": {"uri_param.keyword": "*/proc/self/environ"}}
Example data: [screenshot]

Then we see some further expansion on the LFI vulnerabilities, using a null byte (%00) to abuse string termination, a common vulnerability in e.g. older PHP installations.

Search term: {"wildcard": {"uri_param.keyword": "*%00"}}
Example data: [screenshot]

We also see attacks containing carriage returns (%0D) and line feeds (%0A). In this example we see an attempted command injection, trying to ping localhost 5 times. The attacker measures the time taken for a normal request, i.e. without the attack payload, versus the request below. If the request with the payload takes approximately 5 seconds longer to complete, it confirms the attack vector and a command injection is very likely possible.

Search term: {"wildcard": {"uri_param.keyword": "*%0a"}}
Example data: [screenshot]

And finally, of course, SQL Injection queries.

 

Search term: "Union all"
Example data: [screenshot]

 

Backup data

Next in the line of examples is backed-up data. Many developers and IT operators make temporary backups available online. While sharing these, it is evident that some have used URL shorteners to make life more convenient. This vulnerability classifies as an information leak.

 

Search term: {"wildcard": {"uri_path.keyword": "*.bak"}}
Example data: [screenshot]

Search term: {"wildcard": {"uri_path.keyword": "*.sql"}}
Example data: [screenshot]

File sharing

Another common finding is people sharing files with no password protection, assuming that the random identifiers (magic values) will keep the documents safe. But when they use a URL shortener, they effectively share the URL with the world. In this case we are looking for Google Documents, and honestly, there are so many sensitive documents leaked here, across a plethora of different file sharing services.

 

Search term: uri_domain:docs.google.com
Example data: [screenshot]

Search term: uri_main_domain.keyword:mega.nz
Example data: [screenshot]

Search term: uri_main_domain.keyword:1drv.ms
Example data: [screenshot]

 

 

Booking reservations and similar

Cloud storage links are not the only passwordless links being shared online. Hotel and travel ticket systems also commonly send emails with passwordless links for easy access to your booking details, as handily provided below:

 

Search term: uri_domain:click.mail.hotels.com
Example data: [screenshot]

 

 

 

Test, Staging and Development environments

There are also a lot of test systems being deployed, and links to them are often shared between developers. Searching for test, beta, development, etc. often reveals interesting results as well.

 

Search term: uri_lastsub_domain:test
Example data: [screenshot]

Search term: uri_lastsub_domain.keyword:staging
Example data: [screenshot]

 

Login tokens and passwords

Session tokens in URLs have for years been considered bad practice, primarily because proxies and shoulder-surfing attacks could potentially compromise your current session. The dataset contains many examples of URLs with sensitive info, e.g. tokens and passwords.

Search parameter: {"wildcard": {"uri_param.keyword": "*password=*"}}
Example data: [screenshot]

Search parameter: {"wildcard": {"uri_param.keyword": "*jsessionid=*"}}
Example data: [screenshot]

Conclusions

Looking at archiveteam.org's data, it is very evident that sensitive data is being shortened unscrupulously. These services are clearly being used without proper thought about who else can gain access to the URLs, a classic example of security through obscurity. Obviously, someone with bad intentions could abuse this to their advantage, but it is also clear that organizations could use these same services to look for their own data being leaked, and possibly compromised. Furthermore, penetration testers could use these techniques to look for customer data, providing potentially valuable intelligence on the attack surface they are working with.

 

What type of vulnerabilities and information leaks can you find? Please share in the comments below. In the next section we share how to get this setup running on your own server, so you can start analyzing and protecting yourself.

 

Setup and Installation

A detailed description of the configuration in use would probably require its own article, so to avoid making this one too long we will not go deep into configuration details, but we will give a brief description of the components in use.

Our current setup involves two servers: an OpenBSD server used for fetching files, sending them to the database server and relaying HTTPS requests, and another server running an ELK (Elasticsearch / Logstash / Kibana) stack on FreeBSD with ZFS. The ELK stack is composed of the following components:

  • Elasticsearch: The database.
  • Logstash: Log and data-parser. Parses the data to a common format and feeds it into Elasticsearch.
  • Kibana: A web server and frontend for Elasticsearch.

 

Setting up these services is common practice today, and a multitude of tutorials exist on how to accomplish it.

 

On the OpenBSD server we have a script for downloading archives from archive.org. The files are decompressed and their content is appended to a file monitored by Filebeat. Filebeat then ships the data to Logstash, which runs on the Elasticsearch server. Logstash is set up with some custom grok filters that split each URL into separate fields, giving us handy fields to use for filtering and graphs. Logstash then sends the processed data to Elasticsearch, where it is stored in indices based on top-level domain (.com, .org, .no, etc.). This is done to speed up searches for specific top-level domains. Below you can see the fields we have extracted from the URLs.

[screenshot: extracted URL fields, e.g. uri_domain, uri_main_domain, uri_lastsub_domain, uri_path and uri_param]
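Our exact Logstash configuration is out of scope for this post, but as a minimal sketch of the grok idea (the pattern below is illustrative, not our production filter; the field names match those used throughout this post):

filter {
  grok {
    # Split the destination URL into protocol, domain, path and parameter fields
    match => {
      "message" => "%{URIPROTO:uri_proto}://%{IPORHOST:uri_domain}(?::%{POSINT:uri_port})?(?:%{URIPATH:uri_path})?(?:\?%{GREEDYDATA:uri_param})?"
    }
  }
}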

 

As an example, uri_lastsub_domain contains the bottom-level domain label. This is handy when you want to filter on any sub-domains starting with "api", "dev", etc.
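For instance, this Discover search (same Kibana/Lucene syntax as the search terms in the tables above, with a trailing wildcard) matches records whose bottom-level label starts with "api":

uri_lastsub_domain:api*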

 

After feeding the data to the database we can start searching and looking through the data.

We can use the Discover page in Kibana to explore interesting data. We can also use the Dev Tools page to make more advanced queries, e.g. regex searches for hashes.
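As a sketch of such a regex search, a regexp query for 32-character hex strings would catch MD5-style hashes in URL parameters (run it from the Dev Tools console, or with curl as below):

curl -s -X POST -H "Content-Type: application/json" 'http://localhost:9200/logstash-shorturls-.com/_search?size=10' -d '
{
  "query": {
    "regexp": { "uri_param.keyword": ".*[a-f0-9]{32}.*" }
  }
}'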

 

Example of use:

Searching through archives from the last 60 days, including only .com domains, using the string "(union AND select) OR (select AND concat)" yields an amazing 6,995 results, as per the following screenshot:

[screenshot: the 6,995 search results in Kibana]

The Kibana Dashboard allows us to present some colorful and useful graphs:

[screenshot: Kibana dashboard graphs]

 

Performance and Speed

Currently, the DB server with the ELK stack runs on a low-end VPS. Searching for "test" through all 83,927,323 .com records currently in Elasticsearch, via Kibana, takes 3-5 seconds. Speed depends on the type of search, though: searching a specific non-analyzed (keyword) field in the same index takes about 23 seconds. Still not too bad. Keyword fields are particularly useful when a search term includes special characters.

 

Search Options

In Kibana the main search options are the search bar in Discover and custom queries in Dev Tools. The former is good for quick searches and provides a nice list of the results. The listing can easily be modified to include the fields you need, and you can adjust it as you go. When working with results in the Discover page you can also filter on specific field data, making it very easy to filter out uninteresting data as you browse through the results. This is shown in the screenshot below:

[screenshot: field filtering in Discover]

 

The console UI in the Dev Tools page is a great place to build up specific queries. It lets us query Elasticsearch directly:

[screenshot: Dev Tools console]

Searching and processing data from the terminal

Elasticsearch results are returned in JSON format. This makes it easy to write tools that interact with the data, and to process the data in e.g. a UNIX terminal. Let's say we want to get all results containing the string "../" and see how many different domains we get hits on. To do this we can simply query the Elasticsearch server with curl and pipe the output to a file. For example:

curl -s -X POST -H "Content-Type: application/json" 'http://localhost:9200/logstash-shorturls-.com/_search?scroll=10m&size=5000' -d '
{
  "query": {
    "wildcard": {
      "uri_param.keyword": "*../*"
    }
  }
}' > res

 

This normally results in a huge block of JSON, so we pipe it to jq and extract just the information we need (jq's -r flag outputs the raw string values, without surrounding quotes):

cat res | jq -r '.hits.hits[]._source.uri_domain' > res.dom

Finally we summon the power of standard Unix tools to process the data:

cat res.dom | sort -u | wc -l

873

This could obviously all be done in a single command, without storing any files; it is just a trivial example showing that we can easily process the data in whichever manner we want. Now that we have the domains, we can do more fun stuff, like checking them against a list of bug bounty scopes:

grep -f res.dom bug-bounty-domains.txt
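And for reference, a sketch of the same domain count as one command, with no intermediate files (a single response without the scroll API is capped at the requested size, so treat the number as approximate):

curl -s -X POST -H "Content-Type: application/json" 'http://localhost:9200/logstash-shorturls-.com/_search?size=5000' -d '
{ "query": { "wildcard": { "uri_param.keyword": "*../*" } } }' | jq -r '.hits.hits[]._source.uri_domain' | sort -u | wc -l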

 

Happy hunting!

 

