How TraceMint Works

01

The Analysis Pipeline

Unlike traditional scanners that rely on regex patterns, TraceMint uses a multi-stage pipeline that combines static analysis with semantic understanding. Each stage progressively refines candidates from thousands of pattern matches down to verified vulnerabilities.


Parse

AST extraction via tree-sitter across 30+ languages

source → AST → symbol table

Generate

Pattern + Taint + Route generators produce candidates

1800+ patterns × taint flows

Verify

Category-specific verifiers check guards & sanitizers

guard dominance + binding proof

Verdict

Evidence-backed findings with proof chains

VULN | NEEDS_REVIEW | SAFE

02

Language-Agnostic AST Parsing

We use tree-sitter for high-fidelity AST extraction, enabling precise semantic analysis across 30+ programming languages. Each language has a dedicated taint engine that understands framework-specific idioms.

AST Node Extraction & Taint Propagation

# Source Code (PHP - Laravel)
public function show(Request $request) {
    $orderId = $request->input('order_id');
    $order = Order::find($orderId);
    return response()->json($order);
}

# Extracted AST with Taint Labels
FunctionDecl: show
  └─ Parameter: $request [TAINT_SOURCE: HTTP_REQUEST]

Assignment: $orderId
  └─ MethodCall: $request->input('order_id')
      └─ [USER_CONTROLLED: query_param]

MethodCall: Order::find($orderId)
  └─ Argument: $orderId [TAINT_SINK: DB_LOOKUP]
  └─ [POTENTIAL_IDOR: No ownership check before sink]

Return: response()->json($order)
  └─ [DATA_EXPOSURE: Full object returned]
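The labelling above can be approximated with a small forward pass over statements. The following is a minimal sketch, not TraceMint's engine: the `Stmt` record and `propagate` function are hypothetical names, and the three-statement model of `show()` is hand-written rather than produced by tree-sitter.

```python
from dataclasses import dataclass


@dataclass
class Stmt:
    """Hypothetical mini-IR statement (an assumption, not TraceMint's IR)."""
    kind: str          # "source", "assign", or "sink"
    target: str = ""   # variable written by a source or assignment
    uses: tuple = ()   # variables read by an assignment or sink
    label: str = ""    # e.g. "HTTP_REQUEST" or "DB_LOOKUP"


def propagate(stmts):
    """Forward taint pass: return (sink_label, variable, origin) triples."""
    tainted = {}   # variable -> label of the source it came from
    findings = []
    for s in stmts:
        if s.kind == "source":
            tainted[s.target] = s.label
        elif s.kind == "assign":
            origins = [tainted[u] for u in s.uses if u in tainted]
            if origins:
                tainted[s.target] = origins[0]
            else:
                tainted.pop(s.target, None)  # overwritten with clean data
        elif s.kind == "sink":
            for u in s.uses:
                if u in tainted:
                    findings.append((s.label, u, tainted[u]))
    return findings


# Hand-written model of the show() method from the example above.
show = [
    Stmt("source", target="$request", label="HTTP_REQUEST"),
    Stmt("assign", target="$orderId", uses=("$request",)),
    Stmt("sink", uses=("$orderId",), label="DB_LOOKUP"),
]
findings = propagate(show)
```

Here `findings` holds one triple linking the `DB_LOOKUP` sink to `$orderId` and its `HTTP_REQUEST` origin. Whether that flow is actually exploitable is decided later, by guard verification.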

Codebase Indexing Output

Before analysis begins, we build a complete map of your application's structure.

🗺️ Route Map

GET  /api/orders/{id}     → OrderController@show
POST /api/orders          → OrderController@store
GET  /api/users/{id}      → UserController@show
PUT  /api/users/{id}      → UserController@update
DELETE /api/admin/users   → AdminController@delete

🔗 Call Graph

OrderController@show
  ├─→ OrderService::getOrder()
  │     └─→ Order::find()  [SINK]
  └─→ response()->json()

UserController@update
  ├─→ $this->authorize()  [GUARD]
  └─→ User::update()

📊 Symbol Index

Classes:      127
Functions:    843
Routes:        47
Middlewares:   12
Models:        23
─────────────────
Taint Sources: 89
Potential Sinks: 156
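As a rough illustration of how a route map and call graph might be combined, the sketch below walks the call graph from a route handler and reports which marked sinks and guards are reachable. The dictionary layout and the `reachable` and `summarize` helpers are assumptions made for this example. Only the node names and the [SINK] and [GUARD] markers come from the listing above.

```python
from collections import deque

# Call graph copied from the listing above; the dict layout is an assumption.
CALL_GRAPH = {
    "OrderController@show": ["OrderService::getOrder", "response()->json"],
    "OrderService::getOrder": ["Order::find"],
    "UserController@update": ["$this->authorize", "User::update"],
}
SINKS = {"Order::find"}          # nodes marked [SINK]
GUARDS = {"$this->authorize"}    # nodes marked [GUARD]


def reachable(entry, graph):
    """Breadth-first walk returning every node reachable from entry."""
    seen, queue = {entry}, deque([entry])
    while queue:
        node = queue.popleft()
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def summarize(entry):
    """List the sinks and guards reachable from one route handler."""
    nodes = reachable(entry, CALL_GRAPH)
    return {"sinks": sorted(nodes & SINKS), "guards": sorted(nodes & GUARDS)}
```

With this data, `summarize("OrderController@show")` reports `Order::find` with no guard on the way, which is the kind of candidate the later stages examine.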

03

Cross-File Interprocedural Taint Tracking

Our interprocedural analysis follows data flow across function boundaries, classes, and files to find vulnerabilities that single-file scanners miss. By default, taint is tracked through three levels of function calls, and the depth is configurable.

routes/api.php Entry Point

Route::get('/orders/{id}', [OrderController::class, 'show']);

$id propagates to controller

Controllers/OrderController.php Controller

public function show($id) {
    return $this->orderService->getOrder($id);
}

Crosses class boundary to service

Services/OrderService.php Sink Location

public function getOrder($orderId) {
    return Order::find($orderId);  // IDOR: No ownership check!
}
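The three-file flow above can be followed with a depth-limited lookup that caches one summary per function. This is only a sketch under simplifying assumptions: each function forwards its single parameter to exactly one callee or sink, and the `PROGRAM` table and `summary` function are hypothetical names. It uses `functools.lru_cache` as a stand-in for reusable function summaries.

```python
from functools import lru_cache

# Hypothetical program model: every function passes its one parameter
# either to another function ("call") or into a sink ("sink").
PROGRAM = {
    "OrderController.show": ("call", "OrderService.getOrder"),
    "OrderService.getOrder": ("sink", "Order::find"),
}


@lru_cache(maxsize=None)
def summary(func, depth):
    """Return the sink that func's parameter reaches within depth calls, else None."""
    kind, target = PROGRAM.get(func, (None, None))
    if kind == "sink":
        return target
    if kind == "call" and depth > 0:
        # Follow the call; the cached result is reused by other callers.
        return summary(target, depth - 1)
    return None
```

`summary("OrderController.show", 3)` resolves to `Order::find`, while a depth of 0 stops at the controller and returns `None`. This shows why the depth setting affects what the analysis can see.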

Taint Flow Visualization

Watch how tainted data flows through your application architecture.

🔗

Inter-procedural Analysis

Tracks taint through function calls with configurable depth (default: 3 levels). Creates function summaries for reuse.

📁

Cross-File Resolution

Resolves imports, inheritance, and namespace across entire codebase. Handles dependency injection patterns.

🏷️

Field-Sensitive Tracking

Tracks taint in object fields independently: $user->id vs $user->name
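Field sensitivity can be pictured as keying taint on an access path rather than on the whole object. The class below is a minimal sketch with hypothetical names (`FieldTaint`, `copy`) and says nothing about how TraceMint stores this state internally.

```python
class FieldTaint:
    """Tracks taint per (object, field) access path instead of per object."""

    def __init__(self):
        self._tainted = set()   # e.g. {("$user", "id")}

    def taint(self, obj, field):
        self._tainted.add((obj, field))

    def copy(self, src, dst):
        """Model `dst_obj->dst_field = src_obj->src_field`."""
        if src in self._tainted:
            self._tainted.add(dst)
        else:
            self._tainted.discard(dst)  # overwritten with clean data

    def is_tainted(self, obj, field):
        return (obj, field) in self._tainted


state = FieldTaint()
state.taint("$user", "id")                          # $user->id is user-controlled
state.copy(("$user", "id"), ("$query", "owner"))    # taint follows the copied field
state.copy(("$user", "name"), ("$query", "label"))  # clean field stays clean
```

After these calls `$user->id` and `$query->owner` are tainted, while `$user->name` and `$query->label` are not. An object-level tracker could not make that distinction.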

04

CFG-Based Guard Verification

Finding a guard isn't enough. We verify that the guard actually protects the vulnerable sink through Control Flow Graph (CFG) dominance analysis. A guard must dominate the sink path to be effective.

🔐

AUTH_CHECK

User authentication

Auth::check(), isLoggedIn()

🛡️

AUTHZ_CHECK

Object-level authorization

$this->authorize(), Gate::allows()

👤

OWNERSHIP

Resource ownership binding

$order->user_id == Auth::id()

✓

VALIDATION

Input sanitization

filter_var(), is_numeric(), htmlspecialchars()

Guard Dominance Analysis

A guard must dominate the sink in the control flow graph. We verify guards don't have bypass patterns.

❌ Guard doesn't dominate - VULNERABLE

if ($isAdmin) {
    log("admin access");
}
// Guard in different branch!
// Sink is NOT protected
$order = Order::find($id);
return $order;

✓ Guard dominates the return - SAFE

$order = Order::find($id);
if ($order->user_id != Auth::id()) {
    abort(403);  // Throws, so execution never reaches the return
}
// Every path to the return passes the ownership check
return $order;
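The two cases above can be checked mechanically with the standard iterative dominator computation. The sketch below is illustrative only. The CFGs are hand-built, the node names are made up, and the guard is modelled as the node on the "check passed" branch, which is an assumption of this example rather than a statement about TraceMint's CFG.

```python
def dominators(cfg, entry):
    """Iterative dominator sets: dom[n] = {n} plus the intersection of dom[p] over predecessors p."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def guard_dominates(cfg, entry, guard, sink):
    """True if every path from entry to sink passes through guard."""
    return guard in dominators(cfg, entry)[sink]


# First example: the admin branch only logs, and the lookup runs either way.
vulnerable = {
    "entry": ["is_admin?"],
    "is_admin?": ["admin_branch", "find"],
    "admin_branch": ["find"],
    "find": ["return"],
}
# Second example: abort(403) ends the failing path before the return.
safe = {
    "entry": ["find"],
    "find": ["owner?"],
    "owner?": ["abort", "owner_ok"],
    "owner_ok": ["return"],
    "abort": [],
}
```

`guard_dominates(vulnerable, "entry", "admin_branch", "find")` is false because one path skips the branch. `guard_dominates(safe, "entry", "owner_ok", "return")` is true because the only path to the return passes the ownership check.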

Proof Obligations: How We Verify

We don't reduce false positives with filters alone. We reduce them with proof obligations: every finding must satisfy a formal obligation checklist before it gets a verdict.

🔍

ACCESS

Resource primitive detected (fetch/update/delete via ORM, raw query, file op)

Required

🔗

BINDING

ID parameter bound to authorization context (user_id, tenant_id, session)

Check

🛡️

DOMINANCE

Guard dominates sink in CFG (no bypass path exists)

Check

⚡

EFFECT

Security-relevant impact (data exposure, state mutation, privilege escalation)

Required

VULN = ACCESS ∧ ¬BINDING ∧ ¬DOMINANCE ∧ EFFECT

All obligations must be proven or disproven. Uncertainty → NEEDS_REVIEW

Verdict Outcomes

VULN

All conditions must be true:

Taint source reaches sink (verified path)
No sanitizer on the path
No guard dominates the sink in CFG
Effect is security-relevant (data exposure, state change)

→ Report with full evidence chain

NEEDS_REVIEW

Uncertainty in analysis:

Guard exists but can't verify effectiveness
Custom sanitizer detected but not in our KB
Cross-file resolution incomplete
Dynamic dispatch blocks static analysis

→ Flag for manual review with context

SAFE

Protection verified:

Guard dominates sink AND binds to user identity
Sanitizer verified for this vulnerability class
Input constrained (enum, hardcoded, internal-only)
Framework provides implicit protection

→ Suppress with documented reason
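The verdict rules above can be sketched as a function over three-valued obligations. This is one possible reading, not the actual verdict engine. The `Proof` enum and `verdict` function are hypothetical, and sending mixed results (for example binding proven but dominance disproven) to NEEDS_REVIEW is an assumption of this sketch.

```python
from enum import Enum


class Proof(Enum):
    PROVEN = "proven"
    DISPROVEN = "disproven"
    UNKNOWN = "unknown"


def verdict(access, binding, dominance, effect):
    """Map the four obligations to VULN, SAFE, or NEEDS_REVIEW."""
    obligations = (access, binding, dominance, effect)
    # Any obligation that is neither proven nor disproven blocks a verdict.
    if Proof.UNKNOWN in obligations:
        return "NEEDS_REVIEW"
    a, b, d, e = (o is Proof.PROVEN for o in obligations)
    # VULN = ACCESS and not BINDING and not DOMINANCE and EFFECT
    if a and not b and not d and e:
        return "VULN"
    # Protected by a binding guard that dominates, or nothing to exploit.
    if (b and d) or not a or not e:
        return "SAFE"
    # Assumption: mixed results go to a human.
    return "NEEDS_REVIEW"
```

For example, `verdict(Proof.PROVEN, Proof.DISPROVEN, Proof.DISPROVEN, Proof.PROVEN)` yields `"VULN"`, and replacing any argument with `Proof.UNKNOWN` yields `"NEEDS_REVIEW"`.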

05

5-Stage False Positive Reduction

Our multi-stage filtering system eliminates false positives while preserving real vulnerabilities. Each stage applies progressively more sophisticated analysis to reduce noise.

1

Static Context Filtering

Test file detection
Comment/string literal exclusion
Dead code path removal
Example/demo file detection

~20% filtered

2

Framework-Aware Analysis

Framework-safe patterns (ORM parameterization)
Built-in validation detection
Internal/admin-only routes
Config file exclusion

~15% filtered

3

AST Sanitizer Detection

Category-specific sanitizers
Custom validator recognition
Type coercion analysis
Encoding function detection

~25% filtered

4

Taint-Aware Reachability

Cross-file taint verification
Function summary utilization
Alias & field tracking
Conditional taint flow

~20% filtered

5

AI-Assisted Semantic Reasoning (Proof-Gated)

Context-aware code reasoning
Business logic understanding
Custom guard pattern recognition
Implicit protection detection

~15% filtered
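Structurally, a staged reduction like this is a chain of predicates, each seeing only what the previous stage kept. The sketch below shows that shape with three toy stages. The stage names, candidate fields, and predicates are all hypothetical and far simpler than the checks listed above.

```python
def run_stages(candidates, stages):
    """Apply (name, is_false_positive) stages in order.

    Returns the surviving candidates and how many each stage dropped.
    """
    dropped = {}
    for name, is_false_positive in stages:
        kept = [c for c in candidates if not is_false_positive(c)]
        dropped[name] = len(candidates) - len(kept)
        candidates = kept
    return candidates, dropped


# Toy predicates standing in for the first three stages.
STAGES = [
    ("static_context", lambda c: c["path"].startswith("tests/")),
    ("framework_aware", lambda c: c.get("orm_parameterized", False)),
    ("sanitizer", lambda c: bool(c.get("sanitizers"))),
]

candidates = [
    {"path": "tests/OrderTest.php"},
    {"path": "app/Repo.php", "orm_parameterized": True},
    {"path": "app/View.php", "sanitizers": ["htmlspecialchars"]},
    {"path": "app/OrderController.php"},
]
survivors, dropped = run_stages(candidates, STAGES)
```

Only the `OrderController.php` candidate survives here. The per-stage drop counts are the kind of data the stage figures above could be derived from.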

Verdict Levels, Not Percentages

The per-stage figures above are approximate, and we don't claim a headline FP reduction number. Instead, every finding gets a clear verdict level based on proof obligations:

VERIFIED: Docker PoC executed successfully; exploit confirmed

PROOF-BACKED: ACCESS + BINDING + DOMINANCE + EFFECT chain complete

NEEDS REVIEW: Strong signal but incomplete proof; human review recommended

06

AI-Assisted Semantic Layer (Proof-Gated)

Our local 32B-parameter model accelerates analysis but never makes final decisions alone. Every AI suggestion must pass through our deterministic proof kernel before becoming a verdict. The AI assists; the proof engine decides.

⚖️

Trust Model: What's Deterministic vs AI?

Always On

Deterministic Core

AST / CFG / DFG analysis
Cross-file taint tracking
Guard dominance verification
Proof obligation checks
Verdict engine logic

Optional

AI-Assisted

Candidate expansion
Finding ranking
Patch suggestions
Human-readable explanations
Business logic hints
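The split above amounts to a gate: the model may propose candidates, but only a deterministic check can accept them. The sketch below illustrates that idea with hypothetical names (`proof_gated`, `toy_prove`, the suggestion fields). The toy check stands in for the proof kernel and deliberately ignores the model's confidence score.

```python
def proof_gated(ai_suggestions, prove):
    """Split AI-suggested candidates by a deterministic prove() check."""
    accepted, rejected = [], []
    for suggestion in ai_suggestions:
        (accepted if prove(suggestion) else rejected).append(suggestion)
    return accepted, rejected


def toy_prove(candidate):
    """Stand-in for the proof kernel: needs a taint path and no dominating guard."""
    return bool(candidate["taint_path"]) and not candidate["guard_dominates"]


suggestions = [
    {"id": "A", "ai_confidence": 0.99, "taint_path": [], "guard_dominates": False},
    {"id": "B", "ai_confidence": 0.40,
     "taint_path": ["$request", "$orderId", "Order::find"], "guard_dominates": False},
]
accepted, rejected = proof_gated(suggestions, toy_prove)
```

Suggestion A is rejected despite its high model confidence because no taint path backs it. Suggestion B is accepted on evidence alone.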

Private, In-House Fine-Tuned Model

Fine-tuned in-house on curated vulnerability data for localization and ranking. No external LLM API calls.

🔒 30K+ Training Examples

📝 25+ Vuln Categories

🎯 80.4% Strict CVE Recall

🔄 Active Development

Semantic Code Understanding

Understands what code does, not just what it looks like. Recognizes custom validators, business logic guards, and framework idioms that pattern matching cannot identify.

Context-Aware Reasoning

Analyzes surrounding code context to determine if a pattern is actually vulnerable or if there's implicit protection. Understands auth middleware, role checks, and ownership patterns.

Continuous Learning

The model is continuously updated with new vulnerability patterns from our ongoing security research. Every CVE we discover improves detection for the next scan.

07

Auto-Generated Proof of Concept

A PoC is generated automatically for each finding and, when Docker is available, replayed in a local lab. You no longer spend hours crafting exploit payloads: TraceMint generates them from the detected vulnerability pattern and your application's API structure.

OrderController.php CRITICAL

$orderId = $request->input('order_id');

// Missing: ownership check!
// Should be: if ($order->user_id != Auth::id())

$order = Order::find($orderId);
return response()->json($order);

Proof Obligations

ACCESS: Proven ✓

Resource primitive detected: Order::find() performs database lookup via Eloquent ORM.

BINDING: Failed ✗

No binding found between $orderId and authenticated user context (Auth::id()).

DOMINANCE: Failed ✗

No guard dominates the sink. Auth middleware exists but doesn't check resource ownership.

EFFECT: Proven ✓

DATA_EXPOSURE: Full order object returned including PII fields.

Verdict: VULN

⚡

One-Click Simulation

Copy-paste ready PoC commands. Test vulnerabilities immediately without manual payload crafting.

🎯

Context-Aware Payloads

PoCs are generated based on your app's actual routes, parameters, and authentication mechanisms.

📋

Report-Ready Evidence

Every finding includes full taint chain, guard analysis, and reproduction steps for security reports.

🐳

Local Docker Replay

If a docker-compose file or Dockerfile exists in the repo, the PoC runs automatically in an isolated lab, and a successful run marks the finding as Verified.
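To give a feel for what a copy-paste PoC command might look like, the sketch below fills a route template with another user's resource ID and emits a shell-safe `curl` command. The function name, the bearer-token authentication, the base URL, and the IDs are illustrative assumptions, not TraceMint output.

```python
import shlex


def idor_poc(base_url, route, param, victim_id, token):
    """Build a curl command requesting another user's resource (hypothetical helper)."""
    path = route.replace("{" + param + "}", str(victim_id))
    url = base_url.rstrip("/") + path
    # shlex.quote keeps the header and URL safe to paste into a shell.
    return " ".join([
        "curl", "-s",
        "-H", shlex.quote(f"Authorization: Bearer {token}"),
        shlex.quote(url),
    ])


# Assumed lab URL and IDs: the attacker's token requests order 1002,
# which belongs to a different user.
command = idor_poc("http://localhost:8080", "/api/orders/{id}", "id", 1002, "ATTACKER_TOKEN")
```

If the response returns the other user's order rather than a 403, the missing ownership check is confirmed.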

"We don't just flag. We prove — and if Docker is available, we reproduce locally."

08

Real-World Results

TraceMint has been battle-tested against 30+ open-source projects, discovering and responsibly disclosing critical vulnerabilities. These are real CVEs, not synthetic benchmarks.

30+ OSS Projects Audited

50+ Vulnerabilities Reported

High Recall on Known CVEs

15+ Critical Severity

EXTENSIBILITY

Built for Customization

TraceMint's modular architecture lets you add new languages, frameworks, and detection rules without modifying core analysis logic.

Core IR + Proof Kernel

Language-agnostic intermediate representation. Proof obligations, CFG analysis, verdict engine.

Stable API

↕

Language Front-ends

Python, PHP, JavaScript, Java, Go, Ruby, +10 more

Framework Adapters

Django, Laravel, Express, Spring, FastAPI, Rails, +20 more

Plugin System

↕

Vuln Category Rules

IDOR, SQLi, SSRF, XSS, RCE, Custom...

Output Formats

SARIF, JSON, HTML, Markdown, Custom...

YAML Config

Extension Points

🔧

Custom Rules

Define new vulnerability patterns in YAML. Specify sources, sinks, sanitizers, and proof requirements.

rules/custom/my_pattern.yaml

🌐

New Frameworks

Add framework adapters that teach TraceMint about routes, middleware, and built-in protections.

adapters/my_framework.py

🗣️

New Languages

Implement a tree-sitter-based parser and taint engine. The core analysis remains unchanged.

engines/my_lang_taint.py

📤

Custom Reporters

Export findings in any format. SARIF support is built in, and reporters can be extended to targets such as JIRA or Slack.

reporters/my_output.py
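As an idea of what a reporter has to produce, the sketch below converts findings into a minimal SARIF 2.1.0 log. The input field names (`rule_id`, `message`, `file`, `line`) and the `to_sarif` function are assumptions for this example, since the real reporter interface is not shown here. The output uses only core SARIF properties.

```python
import json


def to_sarif(findings, tool_name="TraceMint"):
    """Convert finding dicts (hypothetical shape) into a minimal SARIF 2.1.0 log."""
    results = [
        {
            "ruleId": f["rule_id"],
            "level": "error",
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["file"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in findings
    ]
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


report = to_sarif([
    {
        "rule_id": "IDOR",
        "message": "No ownership check before Order::find()",
        "file": "app/Http/Controllers/OrderController.php",  # illustrative path
        "line": 47,
    }
])
sarif_text = json.dumps(report, indent=2)
```

The resulting `sarif_text` can be written to a `.sarif` file. A reporter for another target would walk the same findings and emit that target's format instead.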

ADVANTAGE

Four Pillars That Set Us Apart

Competitors promise "AI agents" and "zero false positives." We deliver something more defensible: a system you can verify, trust, and deploy on your terms — self-hosted or managed SaaS.

01

🔒

Data-Control First

Your code, your deployment choice

Self-hosted or managed SaaS options
No third-party LLM API calls
Air-gapped deployment supported
SOC2, FedRAMP, GDPR compatible

vs. Third-Party LLMs: They send code to external APIs. We keep it in your control.

02

📋

Proof-First

Every finding ships with evidence

4 proof obligations: ACCESS, BINDING, DOMINANCE, EFFECT
Complete source→sink taint chain
Guard analysis with CFG dominance
Exportable evidence for compliance

vs. Pattern Matchers: They say "possible IDOR." We prove why it's exploitable.

03

🔬

Verification-Ready

We eliminate FPs—not you

5-stage FP reduction pipeline
Auto-generated PoC for each finding
Semantic guard verification
Patch-pair testing prevents regressions

vs. Alert Fatigue Tools: They dump 500 alerts. We surface 7 verified vulns.

04

⚡

Mode-Based

Fast for CI, Deep for audits

FAST: Pattern + AST for PR checks
BALANCED: Proof kernel + taint tracking
DEEP: Full LLM verification + PoC
Same engine, configurable depth

vs. One-Size-Fits-All: They run the same scan everywhere. We adapt to context.

The Difference in Action

Pattern Matcher

"Found User::find($id) - possible IDOR"

You investigate. You write the PoC. You decide if it's real.

TraceMint

IDOR: User::find() at line 47
✗ BINDING: No ownership check against Auth::id()
✗ DOMINANCE: Guard at line 12 doesn't protect sink
→ EFFECT: DATA_EXPOSURE (email, address, phone)

Proof chain complete. PoC generated. Ready to fix or report.