Real Words or Buzzwords: LMMs - AI's Security Operations Breakthrough
(Editor's Note: This article in the “Real Words or Buzzwords?” series examines how real words become empty words and stifle technological progress.)
In the first two paragraphs of her comprehensive technical blog post, Multimodality and Large Multimodal Models (LMMs), Chip Huyen, a computer scientist and AI book author, wrote the perfect introduction to this article. I inserted the words in square brackets.
“For a long time, each ML [machine learning] model operated in one data mode – text (translation, language modeling), image (object detection, image classification), or audio (speech recognition).
“However, natural intelligence is not limited to just a single modality. Humans can read, talk, and see. We listen to music to relax and watch out for strange noises to detect danger. Being able to work with multimodal data is essential for us or any AI to operate in the real world.”
The combination of Large Language Models and Large Multimodal Models is driving a sea change, one that has already begun, in physical security’s predictive, proactive, and preemptive operational capabilities. This shift will also significantly alter physical security system design practices, particularly by evolving how we use scenario-based security system design.
Solutions capable of much of what I describe below already exist. Based on current trends in applying these AI models, this article may be considered “old news” as soon as a year from now. The security industry and its customers (security practitioners and their organizations) intend to learn how best to use emerging AI. As I noted in my article on the Physical Security Watershed Moment we’re experiencing, our thinking is the only limitation of our risk mitigation capabilities.
Better, But Not Good Enough
For 50 years, electronic security systems have been constrained by technological limitations. To compensate, we relied on people and processes. Most readers know that humans are both costly and subject to performance limitations. Adding more personnel doesn’t eliminate all security gaps.
The latest generation of AI-enabled video analytics has significantly reduced false alarms in video feeds. However, humans must still assess whether each detected person, object, or activity violates security policies or poses a risk. For example:
- An access-controlled door alarm still requires someone to review the triggered video feed and decide whether to dispatch an officer.
- Tailgating detection often occurs after the fact, necessitating officer intervention.
- Officers typically must investigate, determine if further action is needed (e.g., finding the offender, escorting them out), and assess whether an employee facilitated unauthorized entry.
Some security systems now employ AI-based analytics to predict and prevent tailgating at specific doors. For instance, a system might disable access momentarily and announce, “One entry at a time, please,” deterring further attempts. While effective, such systems operate at the individual door level and do not correlate repeated attempts across multiple doors. This lack of correlation can leave threat patterns unaddressed both in real time and after the incident.
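To make the missing correlation concrete, here is a minimal Python sketch of grouping door-level tailgating detections by person and flagging anyone seen at two or more doors within a short time window. Everything in it is an assumption made for illustration: the TailgateEvent record, the subject_id that video analytics would have to supply, and the 10-minute window are hypothetical and do not describe any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TailgateEvent:
    """One door-level tailgating detection (hypothetical record layout)."""
    door_id: str
    subject_id: str      # assumed tracking ID assigned by video analytics
    timestamp: datetime


def correlate_across_doors(events, window=timedelta(minutes=10), min_doors=2):
    """Return {subject_id: sorted door list} for subjects detected at
    min_doors or more distinct doors within a single time window."""
    by_subject = defaultdict(list)
    for event in events:
        by_subject[event.subject_id].append(event)

    flagged = {}
    for subject, subject_events in by_subject.items():
        subject_events.sort(key=lambda e: e.timestamp)
        start = 0
        for end in range(len(subject_events)):
            # Advance the left edge until the span fits inside the window.
            while subject_events[end].timestamp - subject_events[start].timestamp > window:
                start += 1
            doors = {e.door_id for e in subject_events[start:end + 1]}
            if len(doors) >= min_doors:
                flagged[subject] = sorted(doors)
                break
    return flagged


if __name__ == "__main__":
    t0 = datetime(2025, 1, 6, 9, 0)
    events = [
        TailgateEvent("RD-North", "person-17", t0),
        TailgateEvent("RD-East", "person-17", t0 + timedelta(minutes=4)),
        TailgateEvent("Lobby", "person-42", t0 + timedelta(minutes=5)),
    ]
    print(correlate_across_doors(events))
```

In this sketch only person-17 is flagged, because that subject appears at two different doors four minutes apart. A door-level system would have treated each of those detections as an isolated event.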
New Sensemaking and Communications Capabilities
Modern LLMs and LMMs can process and analyze vast amounts of sequential or related data almost instantaneously, surpassing human capabilities in real-time and historical analysis. These capabilities enable game-changing advancements in incident response, including:
- Correlation Across Multiple Inputs: Integrate and analyze diverse data sources (e.g., text alarm logs, video, audio, and sensor data) as events occur to quickly develop a coherent understanding of a situation.
- Understanding a Situation: Synthesize and contextualize a broad range of inputs to infer patterns, relationships, and the broader context of unfolding or historical scenarios.
- Applying Rules or Policies: Use predefined safety and security rules, learned patterns, or programmed policies to evaluate situations, identify anomalies, assess policy applicability, and detect threats.
- Providing Plain-Language Explanations: Translate complex, multi-source data into clear narratives that describe the situation and its dynamics, generate template-based notices, and deliver reports tailored to their intended recipients (including in each recipient's native language).
These capabilities enable the following kinds of automated incident response actions:
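The rule-application and plain-language steps in the list above can be pictured with a small Python sketch. It assumes the multimodal model has already reduced its inputs to a structured Situation record; that record, the two example rules, and the wording of the templates are hypothetical and stand in for whatever policies an organization actually defines.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Situation:
    """Fused view of one incident (hypothetical fields for illustration)."""
    location: str
    badge_registered: bool
    doors_attempted: list = field(default_factory=list)
    description: str = "An unknown individual"


@dataclass
class Rule:
    name: str
    applies: Callable[[Situation], bool]
    explanation: str   # plain-language template filled from the situation


# Example policies only; a real rule set would come from the organization.
RULES = [
    Rule(
        name="unregistered-badge",
        applies=lambda s: not s.badge_registered,
        explanation="The access card presented at {location} is not registered in the system.",
    ),
    Rule(
        name="multi-door-attempts",
        applies=lambda s: len(set(s.doors_attempted)) >= 2,
        explanation="{description} attempted entry to {location} at {door_count} different doors.",
    ),
]


def evaluate(situation, rules=RULES):
    """Return (rule name, plain-language sentence) for every rule that applies."""
    values = {
        "location": situation.location,
        "description": situation.description,
        "door_count": len(set(situation.doors_attempted)),
    }
    return [(r.name, r.explanation.format(**values)) for r in rules if r.applies(situation)]


if __name__ == "__main__":
    incident = Situation(
        location="the R&D lab",
        badge_registered=False,
        doors_attempted=["RD-North", "RD-East"],
        description="An individual in dark blue pants and a light blue shirt",
    )
    for name, sentence in evaluate(incident):
        print(f"[{name}] {sentence}")
```

Keeping each rule as a named predicate with its own explanation template is one possible design choice. It lets every sentence in an alert be traced back to the policy that produced it.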
- Inform Stakeholders: Notify stakeholders of events and recommend response actions based on policies, procedures, and personnel roles.
- Facilitate Supervisory Approvals: Allow supervisors to approve response actions instantly, armed with full situational awareness.
- Track Responder Activity: Monitor responder arrivals and provide real-time updates on progress.
- Conclude Responses: Identify when a situation is resolved and document the response.
- Incident Overwatch: Perform predictive analysis to anticipate developments, such as shift changes near incident areas. The system could then, for example, notify affected employees to avoid the area and inform stakeholders of these actions.
- Generate After-Action Reports: Produce detailed summaries and timeline-based reports for review and improvement.
- Continuous Learning and Capability Enhancement: The models can update their learning from new data and incident outcomes, enabling them to refine their understanding of scenarios and improve future responses. This continuous improvement allows the models to adapt to evolving threats, enhance predictive accuracy, and identify previously undetectable patterns, helping ensure that security operations remain ahead of emerging risks. As a result, the system's value can grow year over year without requiring any specific user action.
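As one small illustration of the timeline-based after-action reports mentioned in the list above, the following Python sketch merges time-stamped entries from several systems into a single chronological report. The tuple layout, the source labels, and the sample entries are assumptions made for the example, not the output of any real product.

```python
from datetime import datetime


def build_timeline(entries):
    """Format (timestamp, source, note) tuples as a chronological report."""
    lines = ["After-Action Timeline"]
    for timestamp, source, note in sorted(entries, key=lambda e: e[0]):
        lines.append(f"{timestamp:%H:%M:%S}  [{source}] {note}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical entries, listed out of order to show the sorting step.
    entries = [
        (datetime(2025, 1, 6, 9, 4), "access control", "Tailgating attempt at a second R&D door"),
        (datetime(2025, 1, 6, 8, 45), "video", "Red pickup truck enters the property"),
        (datetime(2025, 1, 6, 9, 0), "access control", "Tailgating attempt at the first R&D door"),
    ]
    print(build_timeline(entries))
```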
Example Force-Multiplier Scenario
An alert based on a series of tailgating events might read:
“An unauthorized individual in dark blue pants and a light blue shirt is attempting access to the R&D lab at multiple doors.”
This analysis combines access control and video surveillance data. Extending the scenario, AI could apply rule-based actions to examine recorded video for additional context. The resulting alert might state:
“The access card used in the tailgating attempts isn’t registered in the system. The individual drove onto the property 15 minutes ago in a red pickup truck parked in row 5. The person first attempted to tailgate into R&D behind Gretchen Smith, who denied entry.”
Using additional procedural response rules, the AI could prepare dispatch messages for two officers and submit them for supervisory approval. One message would direct an officer to the R&D lab, while the other would assign an officer to monitor the parked vehicle, noting that it currently has no visible occupants.
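The approval step in this scenario can be sketched in Python as follows. The sketch assumes that every drafted dispatch message starts in a pending state and is released only after a supervisor's decision. The DispatchMessage fields, the officer names, and the approve callback are hypothetical placeholders for a real dispatch and approval interface.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING_APPROVAL = "pending approval"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class DispatchMessage:
    officer: str
    destination: str
    instructions: str
    status: Status = Status.PENDING_APPROVAL


def prepare_dispatches(officers, lab, vehicle_location):
    """Draft one message per task; nothing is sent until a supervisor approves."""
    if len(officers) < 2:
        raise ValueError("this scenario needs two available officers")
    return [
        DispatchMessage(officers[0], lab,
                        "Locate and engage the unauthorized individual."),
        DispatchMessage(officers[1], vehicle_location,
                        "Monitor the parked vehicle; no occupants currently visible."),
    ]


def review(messages, approve):
    """Record the supervisor's decision and return the messages cleared to send."""
    for message in messages:
        message.status = Status.APPROVED if approve(message) else Status.REJECTED
    return [m for m in messages if m.status is Status.APPROVED]


if __name__ == "__main__":
    drafts = prepare_dispatches(["Officer A", "Officer B"], "R&D lab", "Parking row 5")
    # Stand-in for a supervisor who approves both drafts.
    for message in review(drafts, approve=lambda m: True):
        print(f"{message.officer} -> {message.destination}: {message.instructions}")
```

Keeping drafts in a pending state until the review step reflects the human-in-the-loop approval described above: the AI prepares the response, and a supervisor decides whether it goes out.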
Without LLM and LMM capabilities, security would lack real-time situational awareness of such an event. Manual video review and assessment would likely trail real-time activity by 15 minutes or more, preventing timely intervention. Furthermore, security would have no immediate knowledge of an intended escape route unless additional manual video searches were conducted—by which time the intruder would likely have already left.
Many more variations of this scenario are possible. For example, a rule could trigger a hallway talk-down through a camera's audio output, opening with an AI-generated offer of assistance to the intruder and then patching in a SOC operator for live communication.
Utilizing Advancing AI
Emerging AI capabilities provide several key opportunities:
- Leverage Existing Systems: Many existing security system deployments can be expanded or adapted for advanced AI functionalities.
- Maximize Resources: LLMs and LMMs act as force multipliers, enhancing security personnel and technology.
- Expand Risk Thinking: Traditional risk analysis thought processes remain applicable but can now be expanded to account for enhanced capabilities across the five Ds of security (Detect, Deter, Deny, Delay, Defend), to which I add a sixth: Document.
Integrating LLM and LMM AI capabilities represents a transformative step forward for physical security operations. We can now envision security responses that come closer to the ideal and that increase the ROI of our existing security systems and our security teams.