Artificial intelligence (AI) and large language models (LLMs) can help threat intelligence teams to detect and understand novel threats at scale, reduce burnout-inducing toil, and grow their existing talent by democratizing access to subject matter expertise. However, broad access to foundational Open Source Intelligence (OSINT) data and AI/ML technologies has quickly led to an overwhelming amount of noise for users to sift through. Mandiant, by contrast, takes a more nuanced approach to fuse industry-leading expertise, unique proprietary data sources, and cutting-edge ML to enable a holistic and profoundly insightful view of your organization and its threat profile.
Ultimately, this means better personalization, scoring, and overall outcomes for our customers, but also a virtuous cycle of improved data collection and detections for Mandiant analysts. Now, as part of Google Cloud, Mandiant can bring the power of Google’s cutting-edge AI technologies to bear on the world’s toughest security and threat intelligence problems.
At Mandiant, our threat intelligence operations are based on the five phases of the Threat Intelligence Lifecycle, shown in Figure 1. The lifecycle shows the collection and progressive refinement of intelligence from raw data to actionable intelligence that holistically captures the threat landscape for our customers. AI is used at each stage in the lifecycle to enrich the data with detection information, extract critical details from unstructured data, normalize and categorize that data, prioritize the intermediate outcomes, and give customers the intelligence necessary to proactively defend against emerging threats.
In this blog post, we discuss some of the exciting ways we are currently leveraging AI in the Threat Intelligence Lifecycle, as well as how Google Cloud’s Sec-PaLM 2 LLM can expand Mandiant’s threat intelligence capabilities. For more, learn about how we are working to secure AI technologies through our Secure AI Framework (SAIF), about securing the AI pipeline, and about how Mandiant analysts and consultants are leveraging AI.
Figure 1: The Threat Intelligence Lifecycle. Mandiant brings AI technologies to bear on threat intelligence problems throughout the process. Those ML models benefit from novel, high-quality threat data for a virtuous cycle that benefits our analysts and customers.
Collection — Gather Information About Threat Activity
In the collection phase, Mandiant strives to be the “best threat telescope” by collecting threat intelligence data from various sources, ranging from Mandiant’s frontline intelligence gained from responding to over 1,000 breaches per year, to the Google Cloud SecOps services providing global telemetry, to the proactive threat data collection performed by our combined 500+ threat intelligence professionals.
This combination of hyper-focused analysis and global visibility puts Mandiant and Google Cloud in a unique position to be the first to identify attacks, while simultaneously gaining tremendous insight into attacker tactics, techniques, and procedures (TTPs). Not only does the collected data form the foundation of our threat intelligence products, it also allows us to train our AI models with high-quality examples, which in turn leads to a virtuous cycle of better detections and improved outcomes for customers.
Structure and Enrichment — Enhance the Analytic Value of the Data
Once we have collected the raw intelligence, the next step is to process and enrich the data to make it more useful for machine and human analysis. Mandiant runs dozens of unique AI models on data collected from botnets, cybercrime forums, and messaging services to categorize and annotate the extracted data with valuable enrichments, thereby saving analysts hours of manual examination.
As an example, Mandiant has more than a dozen AI models focused on extracting information from binary files, including malware classifiers for various file types, a ranking model to prioritize interesting Strings output, deep learning models that identify and explain function behaviors, and much more. For unstructured data, such as cybercrime forum posts, we use a state-of-the-art natural language processing (NLP) pipeline for entity extraction, topic classification, translation, and other tasks to provide the analysts with ready access to the most important information with minimal effort. These proprietary Mandiant analytic results are combined and cross-referenced with enrichments from OSINT feeds and third-party services to ensure analysts and our customers have access to unparalleled breadth and depth of information about the latest threats.
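To make the entity extraction step more concrete, here is a minimal sketch of pulling a few indicator types out of an unstructured forum post. It is a deliberately simple stand-in that uses regular expressions rather than trained NLP models, and it is not Mandiant's pipeline. The pattern set, the function name extract_entities, and the sample post are all assumptions made for illustration.

```python
import re
from typing import Dict, List

# Hypothetical patterns for illustration only. A production NLP pipeline
# would rely on trained models, not regular expressions.
PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
}


def extract_entities(text: str) -> Dict[str, List[str]]:
    """Return unique matches per entity type, in order of first appearance."""
    found: Dict[str, List[str]] = {}
    for name, pattern in PATTERNS.items():
        seen: List[str] = []
        for match in pattern.findall(text):
            if match not in seen:
                seen.append(match)
        found[name] = seen
    return found


# Made-up forum post; the IP address is from a documentation-only range.
post = "Selling access via 203.0.113.7, exploit for CVE-2023-12345 included."
print(extract_entities(post))
```

Even this crude version shows the shape of the output an analyst would consume: typed entities attached to the raw text, ready to be cross-referenced with other enrichments.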
Analysis — Making Sense of the World
With enriched and normalized data in hand, our Mandiant analysts can now begin to piece together the broader picture from the individual pieces of intelligence. However, the scale of this visibility into the threat landscape can be overwhelming. Here, AI plays an important role in helping the analyst triage and prioritize the information available to them, and ultimately, to speed up and scale their ability to put the puzzle pieces together.
Mandiant Threat Intelligence offers several scoring models for indicators of compromise (e.g., hashes and IP addresses) and for alerts generated by our Digital Threat Monitoring product. These models dramatically decrease noise and irrelevant alerts, with a 96% reduction in false positives during alert generation and a 97% IOC reduction when filtering on high confidence indicators. Mandiant analysts also leverage AI technologies to help them understand the overlap of threat actor TTPs and highlight opportunities to improve attribution. These analysis-focused AI solutions help them home in on what matters most and scale their expertise by minimizing repetitive work that leads to burnout.
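As a rough illustration of what filtering on high confidence indicators looks like on the consuming side, the sketch below keeps only indicators whose score meets a threshold and reports the resulting reduction. The Indicator fields, the 0-100 score scale, the threshold of 80, and the sample data are all hypothetical; the scoring models themselves are not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Indicator:
    value: str
    kind: str   # e.g. "ipv4" or "md5"
    score: int  # hypothetical confidence score on a 0-100 scale


def filter_high_confidence(indicators: List[Indicator],
                           threshold: int = 80) -> List[Indicator]:
    """Keep indicators whose score is at or above the threshold."""
    return [i for i in indicators if i.score >= threshold]


def reduction_pct(before: int, after: int) -> float:
    """Percentage of indicators removed by filtering."""
    if before == 0:
        return 0.0
    return 100.0 * (before - after) / before


# Made-up sample data for illustration.
feed = [
    Indicator("203.0.113.7", "ipv4", 95),
    Indicator("198.51.100.23", "ipv4", 40),
    Indicator("d41d8cd98f00b204e9800998ecf8427e", "md5", 10),
    Indicator("192.0.2.55", "ipv4", 82),
]

kept = filter_high_confidence(feed)
print(f"kept {len(kept)} of {len(feed)} indicators, "
      f"{reduction_pct(len(feed), len(kept)):.1f}% reduction")
```

The threshold is the main design lever: raising it trades coverage for fewer low-value matches, so it would normally be tuned per use case rather than fixed.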
Disseminate and Deploy — Operationalize Intelligence to Proactively Detect Threats
The outcomes from the analysis process make their way to our customers in a number of ways that improve their ability to detect and stop novel threats. Mandiant disseminates strategic threat intelligence via analyst-curated reports and threat graph insights (e.g., Actor, Malware, Campaign & Vulnerability intelligence), while tactical intelligence is converted to machine-readable data (MRTI) and signatures for immediate use by customers and partners. Even here, AI is providing an opportunity to revolutionize intelligence-driven detection and drastically decrease time to detection. Mandiant Breach Analytics for Chronicle, for instance, leverages indicator scoring and enrichment models, along with end-to-end automated pipelines, to quickly put information gained from front-line Mandiant investigations to work for our customers. At the same time, Mandiant is working to personalize its threat intelligence to the needs of each customer by recommending relevant intelligence and customizing scoring based on each customer’s threat profile (e.g., industry, geographic region, vulnerabilities).
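One simple way to picture scoring that is customized to a customer's threat profile is to boost an item's base score whenever it overlaps with the customer's industry, region, or known vulnerabilities. The sketch below does exactly that. The class names, the boost weights, and the additive formula are assumptions chosen for illustration, not a description of how Mandiant computes personalized scores.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ThreatProfile:
    """Hypothetical customer profile."""
    industries: Set[str] = field(default_factory=set)
    regions: Set[str] = field(default_factory=set)
    vulnerabilities: Set[str] = field(default_factory=set)


@dataclass
class IntelItem:
    """Hypothetical piece of intelligence with targeting metadata."""
    title: str
    base_score: float
    industries: Set[str] = field(default_factory=set)
    regions: Set[str] = field(default_factory=set)
    vulnerabilities: Set[str] = field(default_factory=set)


# Illustrative weights; real values would be tuned or learned.
BOOSTS = {"industries": 10.0, "regions": 5.0, "vulnerabilities": 20.0}


def personalized_score(item: IntelItem, profile: ThreatProfile) -> float:
    """Add a boost for each attribute the item shares with the profile."""
    score = item.base_score
    for attr, boost in BOOSTS.items():
        if getattr(item, attr) & getattr(profile, attr):
            score += boost
    return min(score, 100.0)  # clamp to the 0-100 scale


item = IntelItem(
    title="Ransomware campaign",
    base_score=60.0,
    industries={"finance"},
    regions={"emea"},
    vulnerabilities={"CVE-2023-0001"},
)
customer = ThreatProfile(
    industries={"finance"},
    regions={"apac"},
    vulnerabilities={"CVE-2023-0001"},
)
print(personalized_score(item, customer))
```

Sorting items by this score yields a per-customer ranking, which is the basis for recommending relevant intelligence first.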
Planning and Feedback – Refine Future Threat Intelligence Collections
Mandiant incorporates feedback in a number of ways to ensure constant improvement in our technology and access to the best raw intelligence data. As noted earlier, the high-quality data about threat actors and their TTPs is used directly to train the AI technologies used throughout the Threat Intelligence Lifecycle, which leads to improved detections and better intelligence collection over time. Indirectly, the artifacts analyzed during the process and the analyst evaluation of those artifacts also lead to important findings that drive additional collections, whether those are new botnets, forums, or messaging channels to monitor. Most importantly, feedback from our customers, through both explicit and implicit feedback mechanisms in the Mandiant UI, ensures that our intelligence collections are aimed at the threats that matter most to our customers.
The Future of Mandiant Threat Intelligence with Google Security LLM
The wide adoption of LLM technology and the development of Google’s Sec-PaLM 2 will add a number of transformative capabilities to the Mandiant Threat Intelligence AI toolkit. For one, the generative capabilities of LLMs and their ability to combine massive amounts of information make them well suited to automating the generation of threat intelligence reports and summarizing existing threat intelligence information. Meanwhile, their conversational capabilities can be leveraged to make Mandiant’s knowledge accessible to a broader range of customers. LLMs can even be used to help detect, analyze, and attribute threat actor activities, thereby saving precious analyst time and providing even non-experts with valuable insights.
As an example of the transformative effect that LLMs can have on Mandiant Threat Intelligence, Figure 2 shows LLM-based summarization of a variety of threat intelligence artifacts. Notably, the flexibility of LLMs allows us to summarize finished intel reports alongside structured intelligence, and to customize the scope and technical depth of the summary for different audiences. This and other uses of Google’s Sec-PaLM 2 will help customers unlock even more value from Mandiant’s extensive collection of threat intelligence data.
Figure 2: LLM-based summarization capabilities on the Mandiant Threat Intelligence platform
Summary
AI already plays a critical role in how Mandiant processes and operationalizes threat intelligence today. Going forward, new LLM technologies will continue to expand those capabilities and supercharge our ability to leverage our rich set of intelligence to help customers stay ahead of emerging threats. By automating and optimizing various stages of the Threat Intelligence Lifecycle, organizations can augment the capabilities of their existing staff and make more informed decisions about their security strategy. Mandiant is at the forefront of this innovation, leveraging the power of AI and LLMs to provide organizations with the threat intelligence they need to better protect their assets and stay one step ahead of threat actors.