• 6 LLM Platforms Supported: Ollama, LM Studio, vLLM, LocalAI, llama.cpp, text-gen-webui
• 7 Security Checks Per Service: from critical (no auth) to low (version disclosure)
• 5 Deep-Scan Detection Methods: processes, ports, configs, env vars, Docker

The Problem

The local AI boom is real. Ollama has millions of downloads. LM Studio runs on every developer's laptop. vLLM powers production inference. But security hasn't kept up:

  • Ollama binds to 0.0.0.0:11434 with no authentication — anyone can pull models, read your prompts, and run inference on your GPU
  • LM Studio exposes an OpenAI-compatible API with no access controls
  • Most platforms leak their version, list all loaded models, and expose system prompts through their APIs

This tool scans for all of that in seconds and generates a detailed report with fix recommendations.
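As a rough illustration of what "anyone can pull models and read your prompts" means in practice, here is a minimal probe of an Ollama-style endpoint, written with only the standard library. This is a sketch, not the tool's actual code; the target address below is a placeholder:

```python
import json
import urllib.request

def probe_ollama(base_url: str, timeout: float = 2.0):
    """Return model names if the endpoint answers without auth, else None."""
    try:
        # /api/tags lists installed models on an Ollama server; note that
        # no credentials of any kind are sent with this request.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return [m.get("name") for m in data.get("models", [])]
    except (OSError, ValueError):
        return None  # unreachable, refused, or not an Ollama API

# 192.0.2.1 is a reserved TEST-NET address, so this probe fails and prints None.
# Pointed at a real exposed host, it would print that host's model list.
print(probe_ollama("http://192.0.2.1:11434", timeout=0.5))
```

Any endpoint that answers such a probe is reachable by everyone on the same network, with exactly this little effort.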

Quick Start

A few commands and you're scanning. No signup, no API keys, no cloud dependencies.

# Clone the repository
git clone https://github.com/saikatz/local-llm-configuration-auditor.git
cd local-llm-configuration-auditor

# Install dependencies & run
pip install -r requirements.txt
python run.py

On Windows, you can also double-click start.bat

Scan Modes

Choose how you want to scan when you run the tool:

• This Machine: Deep-scans your local machine — processes, configs, Docker containers, env vars, plus an extended port scan
• Full Subnet: Auto-detects your LAN and scans the entire /24 subnet for exposed AI endpoints
• Custom Target: Enter a specific IP address or CIDR range to scan
• HTML Report: Every scan generates a dark-themed HTML report saved to reports/
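To make the Full Subnet mode concrete, enumerating every host in a /24 takes a few lines with Python's standard ipaddress module. This is a sketch of the idea, not the tool's implementation:

```python
import ipaddress

def subnet_targets(cidr: str):
    """List every host address in a CIDR range, excluding network/broadcast."""
    net = ipaddress.ip_network(cidr, strict=False)
    return [str(host) for host in net.hosts()]

targets = subnet_targets("192.168.1.0/24")
print(len(targets))              # 254 scannable hosts
print(targets[0], targets[-1])   # 192.168.1.1 192.168.1.254
```

Each of those 254 addresses would then be probed on the known LLM ports.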

7 Security Checks Per Service

Each discovered LLM endpoint is audited against these checks:

#  Check                       Severity  Why It Matters
1  Unauthenticated API Access  Critical  Anyone can use your LLM without credentials
2  Unauthorized Model Pull     Critical  Attackers can download arbitrary models via Ollama's /api/pull
3  Model Enumeration           High      All loaded models are visible to anyone
4  System Prompt Extraction    High      Your model's system prompts and configs are exposed
5  Non-Localhost Binding       High      Service is reachable from outside your machine
6  CORS Misconfiguration       Medium    Access-Control-Allow-Origin: * enables browser-based attacks
7  Version Disclosure          Low       Software version exposed; useful for targeted exploits
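The table above is essentially plain data. As a hypothetical illustration (not the tool's internals), it can be modeled as a list of (check, severity) pairs, from which a per-service severity breakdown falls out naturally:

```python
from collections import Counter

# (check name, severity) pairs mirroring the table above
CHECKS = [
    ("Unauthenticated API Access", "Critical"),
    ("Unauthorized Model Pull",    "Critical"),
    ("Model Enumeration",          "High"),
    ("System Prompt Extraction",   "High"),
    ("Non-Localhost Binding",      "High"),
    ("CORS Misconfiguration",      "Medium"),
    ("Version Disclosure",         "Low"),
]

def severity_breakdown(findings):
    """Count findings per severity, worst first."""
    order = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}
    counts = Counter(severity for _, severity in findings)
    return sorted(counts.items(), key=lambda kv: order[kv[0]])

# A service failing every check would report:
print(severity_breakdown(CHECKS))
# [('Critical', 2), ('High', 3), ('Medium', 1), ('Low', 1)]
```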

Deep Local Machine Scan

When scanning localhost, the tool goes beyond port scanning with 5 detection methods — it even finds LLMs that are installed but not currently running:

• Process Scanning: Running LLM processes with PID and listening ports (via psutil)
• Listening Port Harvest: All ports in use by LLM-related processes, including non-default ports
• Config File Discovery: Installed platforms detected by checking known config paths and model directories
• Environment Variables: Platform-specific env vars (e.g., OLLAMA_HOST, OLLAMA_ORIGINS)
• Docker Enumeration: Containerized LLM deployments with port mappings
• Dormant Detection: Flags installed LLMs that aren't running — they may still contain model files worth reviewing
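As one concrete example of the environment-variable method, a check for Ollama-related settings could look like the sketch below. The variable names come from Ollama's documented configuration; the function itself is illustrative, not the tool's code:

```python
import os

# Env vars whose presence indicates an Ollama install or a risky override
OLLAMA_VARS = ("OLLAMA_HOST", "OLLAMA_ORIGINS", "OLLAMA_MODELS")

def ollama_env_findings(env=None):
    """Collect Ollama-related env vars and flag an all-interfaces binding."""
    env = os.environ if env is None else env
    findings = {var: env[var] for var in OLLAMA_VARS if var in env}
    # OLLAMA_HOST=0.0.0.0 exposes the API on every network interface
    if findings.get("OLLAMA_HOST", "").startswith("0.0.0.0"):
        findings["warning"] = "OLLAMA_HOST binds to all interfaces"
    return findings

print(ollama_env_findings({"OLLAMA_HOST": "0.0.0.0:11434"}))
# {'OLLAMA_HOST': '0.0.0.0:11434', 'warning': 'OLLAMA_HOST binds to all interfaces'}
```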

Supported Platforms

Ollama :11434

Most popular local LLM runner. Default config exposes API to all interfaces with zero auth.

LM Studio :1234

Desktop app with OpenAI-compatible server. No built-in access controls.

vLLM :8000

High-performance inference engine. Often deployed in production without auth.

LocalAI :8080

Drop-in OpenAI replacement. Supports multiple model backends.

llama.cpp :8080

Lightweight C++ inference server. Minimalist, often without security features.

text-generation-webui :7860

Gradio-based UI with API. Multiple exposed interfaces and ports.
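The default ports above fit in a simple lookup table. One detail worth noting: LocalAI and llama.cpp both default to :8080, so the port alone can never identify a platform — which is why fingerprinting (below) matters. A sketch mirroring the list above:

```python
# Default ports per platform, as listed above (illustrative data)
DEFAULT_PORTS = {
    "ollama":                11434,
    "lm-studio":             1234,
    "vllm":                  8000,
    "localai":               8080,
    "llama.cpp":             8080,
    "text-generation-webui": 7860,
}

def candidates_for_port(port: int):
    """Platforms using this port by default; fingerprinting must disambiguate."""
    return sorted(name for name, p in DEFAULT_PORTS.items() if p == port)

print(candidates_for_port(8080))   # ['llama.cpp', 'localai']
print(candidates_for_port(11434))  # ['ollama']
```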

Features

• One-command scan: just run python run.py and pick a target
• Smart identification: fingerprints services through API probing, HTTP headers, OpenAPI docs, and model metadata
• Color-coded terminal output: step-by-step progress and severity breakdown
• Dark-themed HTML report: detailed, professional, ready to share with your team
• Minimal dependencies: just requests, plus optional psutil
• Full debug logging: saved to logs/ for review
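A toy version of the "smart identification" idea: classify a service from telltale strings in its HTTP response. The markers below are illustrative (Ollama's root path does answer "Ollama is running"; the others are plausible guesses, not the tool's actual signatures):

```python
# Hypothetical marker -> platform map; first match wins
SIGNATURES = {
    "ollama is running": "ollama",               # Ollama's root-path banner
    "gradio":            "text-generation-webui",
    "vllm":              "vllm",
    "lm studio":         "lm-studio",
}

def fingerprint(body: str) -> str:
    """Return the first platform whose marker appears in the response body."""
    lowered = body.lower()
    for marker, platform in SIGNATURES.items():
        if marker in lowered:
            return platform
    return "unknown"

print(fingerprint("Ollama is running"))    # ollama
print(fingerprint("<html>nginx</html>"))   # unknown
```

The real scanner combines several such signals (headers, OpenAPI docs, model metadata) rather than relying on any single string.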

Authorized Use Only

This tool is for authorized security auditing only. Only scan networks and systems you own or have explicit written permission to test. Unauthorized scanning may violate local laws. The authors are not responsible for misuse.

Frequently Asked Questions

Why should I care about my local LLM being exposed?
Most local LLM platforms ship with zero authentication and bind to all network interfaces by default. This means anyone on your Wi-Fi — coworkers, guests, or anyone in range — can query your AI, read your prompts, download models, and run inference on your hardware. An exposed endpoint is essentially an open door to your AI infrastructure.
Does this tool send any data to the internet?
No. Everything runs locally on your machine. The tool only makes network requests to the scan targets you specify (your localhost or your local network). No telemetry, no cloud APIs, no data exfiltration. The source code is fully open and auditable on GitHub.
What do I need to run it?
Python 3.8 or higher and the requests library (installed via pip). For full deep-scan capabilities (process scanning, port harvesting), psutil is recommended but optional — the tool gracefully degrades without it. Works on Windows, macOS, and Linux.
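The graceful degradation mentioned above is typically done with an optional-import guard along these lines (a sketch of the pattern, not the tool's code):

```python
# Optional dependency: deep process scanning needs psutil,
# but the scanner should still run without it.
try:
    import psutil
    HAS_PSUTIL = True
except ImportError:
    psutil = None
    HAS_PSUTIL = False

def list_llm_processes(keywords=("ollama", "vllm", "llama")):
    """Return matching process names, or [] when psutil is unavailable."""
    if not HAS_PSUTIL:
        return []
    names = []
    for proc in psutil.process_iter(attrs=["name"]):
        name = (proc.info.get("name") or "").lower()
        if any(k in name for k in keywords):
            names.append(name)
    return names

print("psutil available:", HAS_PSUTIL)
```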
Can it find LLMs that aren't currently running?
Yes. When scanning localhost, the deep-scan engine checks for installed platforms by examining config files, model directories, environment variables, and Docker containers — even if the service is stopped. These dormant installs are flagged because they may still contain model files and configuration worth reviewing.
Is this free for commercial use?
Yes. The tool is released under the MIT License, which permits commercial and private use, modification, and distribution. See the LICENSE file in the repository for details.

Need Help Securing Your AI Infrastructure?

Our team can help you lock down local LLM deployments, implement authentication, configure network isolation, and audit your AI supply chain.

Book a Free Consultation