Join Flashpoint CEO Josh Lefkowitz and Forrester senior analyst Josh Zelonis on Nov. 6 for a webinar titled “Collections, Confidence and Context: How to Assess a Threat Intelligence Vendor.” Our experts will discuss the challenges that accompany an evaluation of a threat intelligence vendor, and what components make up an effective strategy.
A threat intelligence vendor’s collection strategy and data source coverage are important factors to consider when investing money and time with a provider. No intelligence operation succeeds without quality data fueling it, and nothing subverts a program’s capabilities, objectives, and requirements faster than data of little to no value.
To evaluate a threat intelligence vendor’s collection strategy effectively, decision makers must ask the right questions about the data sources behind that strategy, how the strategy maps to intelligence requirements, and what role automation plays in the vendor’s collections.
What Data Sources are Behind Your Collection Strategy?
Vendors collect from sources located on the Deep & Dark Web (DDW) and the open web but may confuse customers when describing these sources. Operational security requires collections specialists to protect their access to sensitive sources, so they are intentionally, and understandably, vague about certain aspects of collections. There are times, however, when vendors are too vague and reveal little more than that they collect from DDW and open web sources.
That amount of information is not enough to determine the data’s origin and value. While some of the more popular DDW marketplaces, such as Dream Market, are accessible to anyone with a Tor browser, private DDW forums are highly exclusive, typically invite-only, and contain data that tends to differ substantially from what is generally available from other types of DDW sources.
Vendors should describe and categorize data sources granularly rather than speak too generally. Within the DDW and open web are numerous data sources and highly differentiated data that can separate a failed intelligence operation from a successful one. These sources generally include:
- Private or invite-only forums
- Chat services platforms
- Illicit marketplaces
- Payment card shops
- Paste sites
- Social media sites
Given that DDW and open web sources tend to be poorly delineated in the market, it’s important to understand specifically which sources make up a vendor’s collection strategy before you decide to become a customer.
Collection Strategy Meets Intelligence Requirements
Mapping a vendor’s collection strategy to intelligence requirements (IRs), no matter how comprehensive that strategy is marketed to be, is a make-or-break consideration for buyers. IRs are foundational to the direction of an intelligence operation and will dictate the types of data and sources an effective operation requires. It’s crucial to establish IRs before evaluating vendors, and then to review them thoroughly with a provider to determine whether the provider has access to sources that map to your IRs. If so, dig deeper with follow-up questions such as:
- Which of your sources would be most suitable for my IRs and why?
- Should you lose access to those sources, are suitable backups available?
- What are some examples of how your collection strategy has supported customers with similar IRs?
- What are your collection strategy’s most substantial weaknesses or blind spots with respect to my IRs?
Keep in mind that no vendor will have 100 percent coverage of each and every source that could satisfy your IRs and support your operation, but some vendors will have access to more and better sources than others.
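One way to make this mapping concrete is a simple coverage check. The sketch below is illustrative only: the category labels are adapted from the source list earlier in this article, while the `IntelRequirement` class, the `coverage_gaps` function, and the sample requirements and vendor data are all hypothetical, not part of any vendor's product or any standard methodology.

```python
from dataclasses import dataclass

# Source categories adapted from the list earlier in this article.
SOURCE_CATEGORIES = {
    "private_forums",
    "chat_services",
    "illicit_marketplaces",
    "payment_card_shops",
    "paste_sites",
    "social_media",
}


@dataclass(frozen=True)
class IntelRequirement:
    """A hypothetical intelligence requirement and the source types it depends on."""
    name: str
    needed_sources: frozenset


def coverage_gaps(requirements, vendor_sources):
    """Return {IR name: sorted list of needed source categories the vendor lacks}.

    IRs that the vendor fully covers are omitted from the result.
    """
    gaps = {}
    for ir in requirements:
        missing = ir.needed_sources - vendor_sources
        if missing:
            gaps[ir.name] = sorted(missing)
    return gaps


# Hypothetical example data, for illustration only.
requirements = [
    IntelRequirement(
        "stolen payment card monitoring",
        frozenset({"payment_card_shops", "private_forums"}),
    ),
    IntelRequirement(
        "leaked credential detection",
        frozenset({"paste_sites", "chat_services"}),
    ),
]
vendor_sources = {
    "paste_sites",
    "social_media",
    "illicit_marketplaces",
    "chat_services",
}

gaps = coverage_gaps(requirements, vendor_sources)
for ir_name, missing in gaps.items():
    print(f"{ir_name}: vendor lacks {', '.join(missing)}")
```

A check like this only shows whether a source category is present at all. It says nothing about the depth or quality of a vendor's access within a category, which is what the follow-up questions above are meant to probe.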
Automation is OK, But Can Be a Red Flag
Automation is part of most vendors’ collection strategies, but when it dominates, or even when it plays too small a role, it should raise a red flag for an intel operation. In general, sources that are easier to access are also easier to collect data from automatically. Paste sites on the open web, for example, are freely and safely accessible to anyone, and vendors can and do collect data from them in an automated fashion.
But if a vendor claims to automate the entirety of its collections, it likely lacks the ability to access and/or accurately analyze data from certain types of highly vetted and unique sources. Private or invite-only forums, for example, are exclusive, extremely difficult to access, and therefore nearly impossible to collect data from automatically.
Because many of the adversaries who frequent these forums don’t operate in English, only human analysts with the necessary linguistic skills can gain access. And in many cases, simply being fluent in Russian, Arabic, Mandarin, Turkish, Farsi, Spanish, French, or other languages isn’t enough; analysts also need a keen understanding of the cultural nuances, social norms, idioms, and slang that exist within such communities. Despite promising advances in artificial intelligence and automation, such tools aren’t yet capable of mimicking the level of human expertise required to collect data from these types of sources.