Earlier this week, I did a quick experiment in the MSI Threat Lab. I wanted to see what happened when someone mentioned a URL on Twitter. I took a HoneyPoint Agent and stood it up exposed it to the Internet on port 80.
I then mapped the HoneyPoint to a URL using a dynamic IP service and tweeted the URL via a test account.
Interestingly, for the good, within about 30 seconds, the HoneyPoint had been touched by 9 different source IP addresses. The search engines, it seems, quickly picked the URL out of the stream, did some basic traffic and I assume queued the site for crawling and indexing in the near future. A few actually indexed the sites immediately. The HoneyPoint cataloged touches from 4 different Amazon hosts, Yahoo, Twitter itself, Google, PSINet/Cogent and NTT America. It took less than an hour for the site to be searchable in many of the engines. It seems that this might be an easier approach to getting a site indexed then the old visit each engine and register approach, or even using a basic register tool. Simply tweet the URL and get the ball rolling for the major engines. 🙂
On the bummer side, it only took about 10 minutes for the HoneyPoint to be probed by attacker scanning tools. We can’t tie cause to the tweeting, but it did target that specific URL and did not touch other HoneyPoints deployed in the range which certainly seems correlative. Clearly, search engines aren’t the only types of automated applications watching the Twitter stream. My guess is that scanning engines watch it too, to some extent, and queue up hosts in a similar manner. Just like all things, there are good and bad nuances to the tweet to get indexed approach.
Further research is needed in what happens when a URL is tweeted, but I thought this was an interesting enough topic to share. Perhaps you’ll find it useful, or perhaps it will explain where some of that index traffic (and scanner probes) come from. As always, your mileage and paranoia may vary. Thanks for reading!