How open-source tools help to investigate you online

We all leave a digital data trail on the web that journalists and investigators can use in their research. Here's how they do it.

Even as an open-source intelligence journalist, I am often surprised by how much information is openly accessible if you are able to combine pieces of evidence on a person from various sources.

For us, stories involve people. People can be subjected to online investigations the same way companies or events can. Nowhere was this clearer than during and after the recent Capitol Hill riots in Washington DC. One in-depth Wall Street Journal piece was able to produce detailed findings on some rioters, armed with nothing more than publicly available information.

But I talked about that subject in another comment piece. This week I want to concentrate more generally on advanced research tools for the individual - whether that means protecting yourself or searching for others.

One of the biggest goldmines for personal intel, in my view, is the professional social media platform LinkedIn. In previous investigations, it served us well in confirming and verifying where someone works, how long they worked at a firm, and the rank and job title they held. It’s a starting point for investigating organisations that may have committed mischief of some sort. I say 'may' because everything needs to be confirmed in the traditional way, via sources. Rarely does publicly available information state plainly that companies do or did something wrong. But insiders or ex-employees may know more, and LinkedIn might be an excellent way to contact them.

Although the platform has increasingly restricted search options due to legitimate privacy concerns (by limiting advanced search operators and so on), there are still options at your disposal. For instance, you can still engage in some employee validation. The tool hunter.io allows you to find email addresses associated with a domain name. If someone claims they work at a certain organisation, this might help to verify it. There is also a tool called theHarvester that some commentators believe produces better results for validating email addresses.

By using site-specific search operators on Google, you can find a person on LinkedIn. For instance, I may find more specific search results for person X via ‘site:http://linkedin.com/in “name of X”’. I can also combine several searches into one, searching for “a person” together with a “job title” and a “company”. There are also manual searches for LinkedIn (more about this here). If you have only an image of a person to go on, image search engines such as TinEye, PimEyes, Yandex image search and others (some listed here) might yield results for LinkedIn. Browser inspection tools also allow you to learn more about someone, for instance exactly when they uploaded a post, down to the second.
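For readers who run such searches often, the query-building step can be scripted. The following is a minimal Python sketch, not a method described by any of the tools above. The function names `linkedin_query` and `google_search_url` are hypothetical, and the sketch assumes the operator is written without the `http://` prefix (`site:linkedin.com/in`), which is the more common form.

```python
from urllib.parse import urlencode


def linkedin_query(name, job_title=None, company=None):
    """Build a Google query restricted to LinkedIn profile pages.

    Each term is wrapped in double quotes so Google treats it as an
    exact phrase. job_title and company are optional refinements.
    """
    parts = ["site:linkedin.com/in", f'"{name}"']
    if job_title:
        parts.append(f'"{job_title}"')
    if company:
        parts.append(f'"{company}"')
    return " ".join(parts)


def google_search_url(query):
    """Turn a query string into a Google search URL (URL-encoded)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical person and employer, for illustration only.
    query = linkedin_query("Jane Doe", job_title="engineer", company="Acme")
    print(query)
    print(google_search_url(query))
```

The script only assembles the query and the link. You still open the link in a browser yourself, and results depend on what Google has indexed.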

If you find a LinkedIn profile on Google, you will be reminded that you need an account to view it. But there are ways to circumvent that. By using ‘https://tr.linkedin.com/pub/dir/*name/*surname’ you can search and access profiles without an account.
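As a rough illustration, the URL pattern quoted above can be filled in with a few lines of Python. This sketch assumes that `*name` and `*surname` are placeholders for a first name and a surname. The function name `public_directory_url` is hypothetical, and whether LinkedIn still serves this directory address is not something the code checks.

```python
from urllib.parse import quote

# Pattern taken from the article, with the assumed placeholders
# *name and *surname replaced by format fields.
DIRECTORY_PATTERN = "https://tr.linkedin.com/pub/dir/{first}/{last}"


def public_directory_url(first_name, surname):
    """Fill the directory pattern with a percent-encoded name."""
    return DIRECTORY_PATTERN.format(
        first=quote(first_name.strip(), safe=""),
        last=quote(surname.strip(), safe=""),
    )


if __name__ == "__main__":
    # Hypothetical name, for illustration only.
    print(public_directory_url("Jane", "Doe"))
```

Percent-encoding matters for names containing spaces or accented characters, which would otherwise produce a malformed address.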

On social media platform Twitter, there are also ways to verify accounts. In one recent story we used Follow.me to understand what a specific handle discusses the most. We can do this for the Prime Minister’s account, which, to no-one's surprise, talks mostly about ‘coronavirus watch’ or the ‘vaccine’.

You may also want to understand how many 'fake' followers a certain handle has. One tool for this is SparkToro: enter your handle of interest and check the quality of its followers. Results must be taken with a pinch of salt and a bit of healthy suspicion.

Instead of going through every tool on social media, you may have an email address. Now you want to understand where its owner is listed across the social media landscape. Three tools you could use are Lullar, Spokeo or Email Qualifier. You provide an email address and the tools suggest social media accounts with which it is associated (a few other tools are mentioned in this guide). Once you know where someone is listed, you can follow up with tools for various social media platforms. Open-source intelligence journalism outlet Bellingcat collected a whole range of tools here, and another resource is called OSINT Framework.

There are loads of other tools out there. One extreme case in which journalists analysed personal information is the Wirecard scandal. A manhunt for Jan Marsalek, former chief operating officer at the company, led journalists to use open data to investigate an individual. It’s a timely example. This week there were reports that a former agent with the Austrian BVT, the domestic intelligence agency, is being accused of illegally accessing classified information.

Interpol now features the 40-year-old Marsalek on its legendary Most-wanted list. Other intelligence agencies signalled interest in him. Bellingcat, in collaboration with others, put out a piece last July in which it presented the open-source intelligence case on how he might have vanished into thin air.

Marsalek didn’t wait to get arrested. He left false clues as to where he might be hiding. These included buying plane tickets he never used and hiring someone in the Philippines to falsify immigration records to make investigators think he was in Asia.

But then he gave an essential clue in an interview. Replying to a journalist’s question about whether he was hiding in a “politically stable environment,” he responded: “Do not worry, the same people have been in power for the last 25 years,” Bellingcat reported. By process of elimination, there are only a few jurisdictions that fit this description. One is Belarus. Private jet records suggested he might have entered Belarus on June 19. The journalists checked records of flight tracking websites FlightRadar24, FlightStats and FlightAware. Data from FlightRadar24 suggested a plane might have landed in Tallinn. From there on, journalists attempted to re-assemble possible travel plans. In short, wherever you go, you likely leave some sort of trail.

A short note on open-source data that you find and want to publish. Ethics dictate you should not dox someone. Doxing (or doxxing) is a form of open-source intelligence, but it shouldn’t feature in any open-source journalism investigations. It’s the short form for “dropping documents” and describes compiling a dossier against someone and publishing it online.

People do it with the aim of exploiting, harassing or threatening someone. Doxing can be a serious crime. It can lead to devastating results if people are mislabelled, such as the 'online witch hunt' for a missing Brown University student who was wrongly suspected of involvement in the Boston Marathon bombing. Tanya Basu published an interesting piece earlier in January about the tools and principles ethical online investigators need to follow.

You might never have heard of any of the tools mentioned above and may conclude that all this is a bit creepy. Yes, it can be. But don't fret. You can protect yourself to some extent. Also, bear in mind that law enforcement services, as well as potentially your current employer, might have used some of these techniques to understand more about you. After all, it’s open data. It might not harm you directly, but whatever is on the internet can potentially help someone else to form an opinion about you. So follow basic data hygiene rules or, in more extreme cases, use open-source intelligence tools to remain protected and ‘anonymous’ on the web.
