The Dark Web: What is it and how can you access it
Unless you’ve been living under a rock, chances are good that you’ve heard of the “dark web”. Many view the dark web as something sinister that is all about online crime and illegal content, like an online “Mos Eisley” (a “…wretched hive of scum and villainy.”)
It turns out that the perception and the reality are (as is often the case with these things) quite far apart. In fact, the vast majority of deep web and dark web content is, though hidden from normal view, perfectly innocuous. However, because the dark web is anonymous, private, and difficult to eavesdrop on, attackers and cybercriminals do use it as part of their toolkit.
This means that, from a defense standpoint, it is valuable to understand how the dark web might play into the threat scenarios that we need to defend our businesses against.
With that in mind, let’s do a deep dive on the dark web: what it is, how it works, and how you might use your awareness of it to better your security posture.
“Dark Web” vs. “Deep Web”
The first concept that is useful to understand is the so-called “deep web.” By deep web, we just mean the portion of the Internet that is unindexed, i.e., not pointed to by sites like Google, Bing, or other search engines.
Deep Web – The Library
Probably the easiest way to understand this is through an analogy to the physical world, one that everyone will be familiar with. Imagine for a moment a physical book in a library. Library visitors wishing to find that book go to a central index. For the sake of this example, let’s assume the library has an “old school” card catalog that records information about the book’s location so that anyone wanting to read it can go to the shelves and retrieve it. In that situation, what happens if someone removes the book’s entry from the card catalog? Say some prankster removes the card and tears it up.
As we know, the book doesn’t disappear. It’s still in the same place it always was. But since the index record has been lost, new visitors know neither that the library has a copy nor where it might be on the shelves. Someone who knew exactly where that book was, though, could go right to it and read it just like they always have.
This is almost directly analogous to data (in some cases web pages, but other types of data as well) on the Internet. For web content, we’re used to search engines “crawling” or “spidering” content and creating an index so we can find what we want. Normally, that’s advantageous since most sites want their content to be found by as many users as possible.
But what happens if there’s data that search engines don’t know about? For example, a page that nothing else links to. In this case, when search engines crawl through looking for content to index, they won’t find this page since nothing else links to it. The data is still there, just not indexed.
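To make the crawling idea concrete, here is a minimal sketch in Python. The link graph and page names are made up for illustration, and a real search engine crawler is of course far more involved. The point is only that a crawler following links from a starting page never reaches a page that nothing links to.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post-1", "home"],
    "post-1": ["blog"],
    "staging-test": ["home"],  # links out, but nothing links *to* it
}

def crawl(start, links):
    """Breadth-first crawl: return every page reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

indexed = crawl("home", LINKS)
unindexed = set(LINKS) - indexed  # exists on the server, absent from the index

print(sorted(indexed))    # ['about', 'blog', 'home', 'post-1']
print(sorted(unindexed))  # ['staging-test']
```

The `staging-test` page plays the role of the book whose catalog card was torn up: anyone who knows its address can still open it, but the crawler never discovers it.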
Now, what’s important to note about the deep web is that it potentially includes a lot of data. In fact, you might think about the deep web as a massive “soup” of data including everything from web pages to binary files to media (video and audio) to unstructured data and beyond. It includes pages used only by developers to test new content, new sites that haven’t been indexed, data on miscellaneous servers that happen to be Internet-facing, and so on. In fact, much of “deep web” data isn’t placed there purposefully.
What is the “dark web”?
If the deep web is unindexed information, the dark web is the portion of it that is accessible only via specific, defined channels and protocols. Now, while any content or data that is hidden from view in this way is technically what some call a “darknet”, what people usually mean when they talk about the dark web are pages and data accessible via Tor.
Tor, originally created as “The Onion Router”, is open-source software that implements a specific routing protocol that (among other things) allows users to exchange data anonymously. Tor works by creating virtual “pathways” through a global network of routers where each node in the path only knows about the nodes immediately in front of and immediately behind it in the link. Meaning, nobody in the chain, not even the routers themselves, knows the source, destination, and content at the same time.
To illustrate this, take a look at the diagram below.
Shown in the diagram are three main elements:
A point of origin (labeled “source”)
A destination (labeled “destination”)
A number of intermediate router nodes (labeled “intermediate routing nodes”)
The source routes traffic through each of the routing nodes to reach the destination, taking a random path along the way. One connection might pass through a given combination of nodes and another might use an entirely different set of nodes. For example, in the diagram below, we’ve overlaid two of the possible paths that traffic might take (one in red and one in green). Yet a third connection might take a completely new path.
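As a rough illustration of “a random path per connection”, the sketch below picks a fresh set of relays each time it is called. The relay names, the `pick_path` helper, and the choice of three hops are assumptions for the example, and picking uniformly at random is a simplification rather than a description of how Tor actually selects relays.

```python
import random

# Hypothetical relay names; a real network has far more relays than this.
RELAYS = ["relay-a", "relay-b", "relay-c", "relay-d", "relay-e", "relay-f"]

def pick_path(relays, hops=3, rng=random):
    """Choose `hops` distinct relays in random order (uniform, simplified)."""
    return rng.sample(relays, hops)

# Two connections from the same source will usually take different paths.
path_one = pick_path(RELAYS)
path_two = pick_path(RELAYS)
print(path_one)
print(path_two)
```

Passing a seeded `random.Random` instance as `rng` makes the choice repeatable, which is handy for testing but would defeat the purpose in a real anonymity system.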
The way that this is technically enforced is by using encryption and creating encrypted “layers” for each node. The message is encrypted such that each node along the chain can decrypt only a portion of the data: enough for it to know where the next hop along the path is. As each router decrypts its layer and discovers the next “hop” along the path, it forwards the contents along.
Like the layers of an onion, each layer needs to be stripped away to reveal the underlying message at the center – and since each hop only knows its immediate predecessor and successor in the chain, only the first router in the chain knows the source (but not the destination) and only the last router in the chain knows the destination (but not the source).
Illustrated graphically, the message is encapsulated multiple times: each encapsulation enabling a router to get to the next hop.
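The layering can be sketched in a few lines of Python. This is a toy model under loud assumptions: the `keystream_xor` cipher is a stand-in built from SHA-256 that is NOT secure and is not what Tor uses, the keys are hard-coded, and the router names (`r1`, `r2`, `r3`) and JSON envelope are invented for the example. What it does show is the shape of the idea: the sender wraps the message once per router, and each router can open only its own layer.

```python
import hashlib
import json

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256-derived keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message: bytes, destination: str, path: list, keys: dict) -> bytes:
    """Build the onion from the inside out: the last router's layer is innermost."""
    next_hop, payload = destination, message
    for node in reversed(path):
        layer = json.dumps({"next": next_hop, "payload": payload.hex()}).encode()
        payload = keystream_xor(keys[node], layer)
        next_hop = node
    return payload

def peel(node: str, blob: bytes, keys: dict):
    """What one router does: strip its own layer and learn only the next hop."""
    layer = json.loads(keystream_xor(keys[node], blob))
    return layer["next"], bytes.fromhex(layer["payload"])

# Hypothetical per-router keys, known to the sender and to that router only.
KEYS = {"r1": b"key-one", "r2": b"key-two", "r3": b"key-three"}
PATH = ["r1", "r2", "r3"]

blob = wrap(b"hello", "destination", PATH, KEYS)
hops = []
for node in PATH:
    next_hop, blob = peel(node, blob, KEYS)
    hops.append(next_hop)

print(hops)  # ['r2', 'r3', 'destination']
print(blob)  # b'hello'
```

Note that `r1` only ever sees the name `r2` and an opaque blob, and only `r3` sees the destination. If a router tries to peel a layer that was not encrypted with its key, it gets unreadable bytes rather than the next hop.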
Like Matryoshka Dolls
A way to think of it is by an analogy to the physical world. If I wanted to mail a letter to you, how could I do it in such a way that would keep the post office from knowing that you and I are communicating? Borrowing the layering strategy described above, I could pre-address a letter to you, and put that letter inside another letter to someone else (let’s call them “Friend 1”) along with instructions to unpack and mail the contents.
Then I put the letter to Friend 1 inside another letter addressed to yet another person (“Friend 2”) along with instructions to mail the contents. I could repeat this as many times as I like to create a package that, when mailed, would eventually get to you but where intermediate people along the way (not to mention the postal service) wouldn’t know that you and I are communicating.
The post office can tell that I’m communicating with the first person in the chain, and your postal carrier knows you are communicating with the last person, but there’s no easy way to tell that it’s really you and I talking to each other.
Now that you know the fundamentals of what the dark web is and how it works, let’s talk about how attackers might use it. First and foremost, they can use the above strategies to communicate in ways that, because they’re private, are “safer” for them from law enforcement in the event that they wish to conduct criminal activity.
For example, they might share data about potential victims, exchange tools or data, or buy and sell illicit services. Given this, you will find organizations that specialize in “dark web monitoring”, i.e., services that (through either manual or automated means) look for information on the dark web in which your organization might be involved. Such a service might detect, for example, that an attacker has compromised your environment, stolen information about your business or customers, and is trying to store, sell, transfer, or otherwise exchange that information via an attacker community. You might, for instance, look for information about compromised user accounts appearing in these channels.
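As a small sketch of the automated side of such monitoring, the Python below scans a leaked credential list for addresses belonging to your organization. Everything here is assumed for illustration: the `email:password` dump format, the `find_exposed_accounts` helper, and the sample data (which uses the reserved example.com and example.org domains). Real monitoring services work with far messier data.

```python
def find_exposed_accounts(leaked_lines, org_domain):
    """Return leaked addresses that belong to `org_domain`, normalized and sorted."""
    suffix = "@" + org_domain.lower()
    exposed = set()
    for line in leaked_lines:
        # Assumed dump format: "email:password"; keep only the email part.
        email = line.split(":", 1)[0].strip().lower()
        if email.endswith(suffix):
            exposed.add(email)
    return sorted(exposed)

# Made-up sample data, not taken from any real leak.
SAMPLE_DUMP = [
    "alice@example.com:hunter2",
    "bob@example.org:letmein",
    "Carol@Example.com:correcthorse",
    "alice@example.com:hunter2",  # duplicates are common in real dumps
]

exposed = find_exposed_accounts(SAMPLE_DUMP, "example.com")
print(exposed)  # ['alice@example.com', 'carol@example.com']
```

A hit would typically trigger a password reset and a closer look at how the credentials were exposed.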
Another implication is that there could be those on your internal network who are using Tor to access resources that they shouldn’t while they are at work, for example accessing inappropriate or illegal content hidden by others. In this situation, you might choose to disallow such traffic unless individuals can justify why they require such access to perform their jobs.
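One simple way to spot such traffic, sketched below, is to compare outbound connections against a list of known relay addresses. The relay list, log format, user names, and `flag_tor_connections` helper are all hypothetical (the addresses come from ranges reserved for documentation), and a check like this would only catch connections to relays that are on your list.

```python
import ipaddress

# Hypothetical relay list using documentation-only address ranges.
KNOWN_RELAYS = {ipaddress.ip_address(a) for a in ("192.0.2.10", "198.51.100.7")}

def flag_tor_connections(connections, relays):
    """Return (user, destination) pairs whose destination is a listed relay."""
    return [
        (user, dest)
        for user, dest in connections
        if ipaddress.ip_address(dest) in relays
    ]

# Made-up outbound connection log: (user, destination IP).
SAMPLE_LOG = [
    ("jsmith", "203.0.113.5"),
    ("mlee", "192.0.2.10"),
    ("jsmith", "198.51.100.7"),
]

flagged = flag_tor_connections(SAMPLE_LOG, KNOWN_RELAYS)
print(flagged)  # [('mlee', '192.0.2.10'), ('jsmith', '198.51.100.7')]
```

Whether a flagged connection is a problem is a policy question, which is why the authorization step described above still matters.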
The point is, understanding what the dark web is and how it works makes it a lot less scary. Additionally, there is quite a bit of hype and hyperbole about the dark web (and deep web) being exclusively used for criminal activity. Is there criminal activity that happens on the “dark web”? Sure. Hopefully though, as you can see by now, just because content is not indexed (deep web), that does not make it criminal, and just because someone is using Tor, that doesn’t mean it’s definitely criminal either.