VCF Health Check — Product Brochure

Your VCF Environment Deserves Better Than "It Seems Fine"

The Problem You Already Know About

You are running VMware Cloud Foundation. Your environment has vCenter, SDDC Manager, NSX, ESXi hosts, certificates, DNS records, datastores, vSAN clusters, and dozens of interconnected services — all of which need to be healthy for your infrastructure to function.

Right now, how do you know everything is healthy?

You check vCenter manually. You SSH into SDDC Manager. You open NSX and look around. You hope nothing slipped through the cracks. When something does break, you spend hours figuring out what went wrong, tracing it back through logs, and explaining to stakeholders why it happened.

A certificate expired silently. DNS stopped resolving for a management component. A vSAN disk group went degraded and nobody noticed. An SDDC Manager backup has not run in 47 days. A vCenter service crashed and is not running. ESXi hosts drifted out of compliance. Orphaned VM objects are consuming storage nobody can account for. Unclaimed disks are sitting idle while the team debates whether to buy more capacity.

These are not hypothetical problems. They are the problems that wake you up at 2 AM, cost your company thousands in downtime, and erode trust with the clients you serve.

What If You Could Check Everything in Under 5 Minutes?

VCF Health Check connects to your entire VCF environment, runs over 130 individual checks across every major component, and gives you a single grade (A through F) with a detailed report — all in under five minutes.

One click. Under five minutes. Complete visibility.

No manual SSH sessions. No hopping between six management consoles. No spreadsheets of things to remember to check. VCF Health Check does it all automatically, every time, consistently, and tells you exactly what needs attention — with specific remediation steps for every issue found.

What Gets Checked

VCF Health Check is not a generic monitoring tool. It was built specifically for VMware Cloud Foundation environments and understands the relationships between components that matter. Every check was designed to catch a real problem that has caused a real outage somewhere.

Infrastructure and Network (14 Checks)

DNS forward resolution for every management component — vCenter, SDDC Manager, NSX VIP, and NSX Node. DNS reverse resolution (PTR record) consistency for the same four endpoints. SSL certificate expiry monitoring with configurable advance warnings (default: 30 days). Network latency tracking with TCP connect time measurement to vCenter, SDDC Manager, and NSX — reported in a color-coded latency table. ESXi host HTTPS reachability checks running in parallel for speed, plus per-host SSH port verification and coredump configuration validation. Clock drift detection compares time across all management endpoints using HTTP Date headers, warning at 5-second drift and failing at 30-second drift.
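
As a rough illustration of the clock drift check, the sketch below parses HTTP Date headers and applies the 5-second and 30-second thresholds. The function name, the endpoint labels, and the choice to measure the spread between the earliest and latest endpoint are assumptions made for this example, not the product's actual code. It also ignores the small delay between fetching each header.

```python
from email.utils import parsedate_to_datetime

WARN_SECONDS = 5   # warn threshold from the brochure
FAIL_SECONDS = 30  # fail threshold from the brochure


def classify_clock_drift(date_headers):
    """Classify drift from a mapping of endpoint label -> HTTP Date header value.

    Drift is taken as the gap between the earliest and latest reported time
    (an assumption for this sketch).
    """
    times = [parsedate_to_datetime(value) for value in date_headers.values()]
    spread = (max(times) - min(times)).total_seconds()
    if spread >= FAIL_SECONDS:
        status = "FAIL"
    elif spread >= WARN_SECONDS:
        status = "WARN"
    else:
        status = "PASS"
    return status, spread


# Hypothetical header values, as returned in the Date field of an HTTPS response.
headers = {
    "vcenter": "Tue, 03 Jun 2025 12:00:00 GMT",
    "sddc-manager": "Tue, 03 Jun 2025 12:00:07 GMT",
    "nsx": "Tue, 03 Jun 2025 12:00:02 GMT",
}
print(classify_clock_drift(headers))
```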

If a DNS record is wrong, you will know before it causes a deployment failure. If a certificate is expiring in 30 days, you will know before it causes an outage at 2 AM. If latency to a management endpoint is spiking, you will know before users start complaining.

vCenter Server (42 Checks)

The deepest coverage of any VCF health tool:

API and Identity: API reachability and session creation, version and build detection, VCF BOM compliance validation
System Health: Overall status (green/yellow/orange/red), CPU load, memory, storage, swap, database health — each as an independent check
Inventory: VM count, ESXi host connection states (CONNECTED, DISCONNECTED, NOT_RESPONDING)
Cluster Configuration: HA enabled/disabled per cluster, DRS configuration and mode per cluster
Storage: Datastore capacity per-datastore with configurable warning (80%) and critical (90%) thresholds, vSAN cluster overall health, vSAN group-level test results, vSAN resync and rebuild status, vSAN disk group health per physical disk, storage policy compliance count, content library sync status
New Checks: Orphaned VM object detection (orphaned, inaccessible, and invalid connection state VMs consuming storage), unclaimed vSAN-eligible disk identification (disks present but not claimed by any disk group), cluster CPU capacity threshold monitoring (configurable warn/critical percentages), cluster memory capacity threshold monitoring
Time and Security: NTP synchronization mode and server reachability, root password expiry (days remaining), license expiry and evaluation license detection
Services: Individual status checks for vpxd, STS, vpostgres, rhttpproxy, SCA, SPS, vsan-health, and vLCM
Operational Health: Active alarm count and severity, VM snapshot aging (configurable, default 72 hours), VCSA disk partition utilization (warn 80%, critical 90%), 24-hour event log error scan, ESXi hardware sensor health (parallelized), ESXi scratch partition check, DRS activity and migration tracking

42 checks. Every one of them matters. Every one of them was designed to catch a problem that has caused a real outage somewhere.

New in v9.0: vCenter alarm summary queries all active alarms via PropertyCollector, reporting critical and warning counts. ESXi BOM compliance validates each host build number against a customizable `vcf-bom-reference.json` reference file. Password expiry monitoring covers SSO accounts and ESXi root credentials via the SDDC credential store. vSAN resync and rebuild detection catches active resync operations that indicate data movement. Clock drift detection via HTTP Date header comparison identifies time synchronization issues. SDDC upgrade readiness queries the `/v1/upgradables` API to detect available updates.

SDDC Manager (19 Checks)

Connectivity: API reachability with retry and exponential backoff (automatic 5-second retry on first failure)
Status: Version detection, NSX and vCenter status from SDDC Manager's perspective, management component health per-component deployment status
Security: Certificate inventory with 60-day warning threshold, credential and password expiry monitoring
Backup: Backup age verification (configurable, default 48 hours), backup success/failure status
Lifecycle: System prechecks compliance, in-progress task count and stale task detection (stuck >24 hours), resource lock detection, host drift detection (ASSIGNED vs COMMISSIONED state), depot connectivity verification, lifecycle update availability
Auto-Remediation: Optional automatic cancellation of stale tasks blocking the lifecycle queue
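
The connectivity retry described above can be pictured with a short sketch. It assumes a 5-second first delay that doubles on each further failure and a limit of three attempts. Only the 5-second first retry comes from this brochure; the doubling factor, the attempt limit, and the function name are illustrative assumptions.

```python
import time


def call_with_backoff(probe, attempts=3, first_delay=5.0, sleep=time.sleep):
    """Call probe() until it succeeds, waiting 5s, 10s, 20s... between tries.

    probe is any zero-argument callable that raises OSError on a network
    failure. sleep is injectable so the logic can be exercised without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return probe()
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(delay)
            delay *= 2  # exponential backoff (assumed factor of 2)
```

Passing a recording function as `sleep` makes it easy to confirm the delay sequence without actually pausing.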

New in v9.0: upgrade readiness checking via the upgradables API, enhanced backup age and size tracking with configurable thresholds, and credential store password expiry monitoring.

SDDC Manager is the control plane of your entire VCF deployment. If it is unhealthy, everything downstream is at risk. A single stuck task can block every lifecycle operation in the environment.

NSX Manager (13 Checks)

API reachability with VIP-first and individual node fallback, version detection, control cluster stability, management cluster stability, transport node state (connected count vs total), critical and warning alarm counts, admin password expiry, edge cluster member health per edge cluster, distributed firewall rule count with configurable threshold (default 500), certificate validation, and transport zone membership count.

Network topology validation verifies transport zone and node state health. Transport zone membership counts confirm proper host bindings.

NSX problems cascade fast. A failing transport node or a degraded control cluster can take down your entire network overlay in minutes. A DFW rule count exceeding 500 signals a rule sprawl problem that will impact performance and make troubleshooting a nightmare.

VCF Operations and Aria Suite (4 Checks)

Suite API reachability and token acquisition, node online status per cluster member, collector health verification, and adapter data receiving verification (active adapters vs total).

Fleet and vRSLCM (2 Checks)

API reachability (through VCF Operations proxy or direct endpoint) and environment deployment status.

Custom Plugin Checks

Need to check something specific to your environment? Drop a shell script in the plugin directory. VCF Health Check auto-discovers plugins, runs them, parses their output, and integrates results directly into the grading, scoring, and reporting system. No code changes required.

The Report That Changes How You Work

After every health check, VCF Health Check generates comprehensive reports automatically. Not a wall of text. Not a raw log dump. Professionally designed, interactive reports that you can hand to your CTO, your client, or your operations team.

Interactive HTML Report

A single self-contained HTML file with no external dependencies. Open it in any browser. Everything is inline — CSS, JavaScript, SVG charts, and all data.

Visualization:

Donut SVG chart showing pass/warn/fail proportions at a glance
Per-component trend line charts with grade annotations and trend statistics card showing improvement trajectories
Score sparkline trend chart showing your health trajectory over runs
90-day heatmap calendar — a color-coded grid showing your daily health grade for the past three months
SVG dependency map showing component relationships with health-colored nodes and edges
SLA uptime tracker with per-component uptime percentages and mini timeline bars
Network latency table with color-coded performance badges (good/warn/slow)
Component health cards grid with grade, score, and pass/warn/fail counts per component

Navigation and Filtering:

Dark mode and light mode toggle (button or keyboard shortcut D)
Executive View (high-level summary for leadership) and Technical View (full details for engineers) — toggle with keyboard shortcut E
Pass, Warning, and Failure filter buttons to focus on what matters
Sidebar navigation with anchor links to every section
Collapsible cards for each section to reduce scrolling

Comparison:

Diff view to compare any two historical runs side by side and see exactly what changed, what improved, and what regressed

Actionable Output:

Remediation checklist with checkboxes per issue, estimated time-to-fix badges per item, grouped by component
Playbook export button (keyboard shortcut P) — downloads the entire remediation checklist as a text file ready to hand to an engineer
Copy executive summary button for pasting into emails or tickets
CSV export button for loading into spreadsheets
Print button with print-optimized CSS stylesheet

Usability:

Keyboard shortcuts panel for power users
Mobile-responsive CSS with breakpoints at 768px and 480px for checking reports on phones and tablets
Auto-remediation log section (when --fix was used)

All 10 Report Formats

Format	Generated	Description
Interactive HTML	Always	Self-contained file with all features above
JSON	Always	Machine-readable structured data for integrations
Plain Text	Always	Terminal-friendly with ASCII trend line
PDF	Auto (if Chrome/Edge available)	Print-ready from the HTML report
CSV	--csv flag	One row per check for spreadsheets
Markdown	--markdown flag	Tables grouped by component for wikis
Prometheus/OpenMetrics	--prometheus flag	Gauge metrics for Prometheus scraping
Ansible Inventory	--ansible flag	YAML groups with host variables
Multi-Environment Dashboard	--merge-reports flag	Consolidated grid of all environments
Config Backup	--backup-config flag	Topology and settings JSON (no credentials)

Plus: a JSON Schema file (draft 2020-12) generated alongside every JSON report for downstream validation.

Alerts Where You Already Work

VCF Health Check integrates with ten notification channels:

Channel	How It Works
Email (SMTP)	HTML report attached. Tries msmtp, sendmail, then curl SMTP. Grade and score in subject line.
Slack	Color-coded attachment block. Grade, score, failure list, environment name.
Microsoft Teams	Adaptive Card with fact rows for grade, score, failures, warnings, date.
PagerDuty	Events API v2. Auto-creates incidents on failures. Dedup key prevents duplicates. Auto-resolves when grade returns to A/B.
OpsGenie	Alerts API with priority mapping (F=P1 through A=P5). Tags include component names.
Syslog (RFC 5424)	UDP to configurable host:port. One message per failed check. Compatible with Splunk, ELK, and any SIEM.
Custom Webhook	HTTP POST with full JSON payload. Connect to ServiceNow, Grafana, Datadog, or anything with an API.
ServiceNow	The `--servicenow` flag auto-creates incidents from health check failures, mapping the health grade to ServiceNow priority levels. Configure with `SERVICENOW_INSTANCE`, `SERVICENOW_USER`, and `SERVICENOW_PASS` environment variables.
Jira	The `--jira` flag auto-creates Jira issues from failures. Configure with `JIRA_URL`, `JIRA_USER`, `JIRA_TOKEN`, and `JIRA_PROJECT` environment variables.
Email Digest	The `--email-report` flag sends the full HTML report as an email attachment with grade and score summary in the subject line.

Threshold-based alerting: Set a grade threshold (default: C) and only get notified when the grade drops to that level or below. No noise. No alert fatigue. Just the signal you need.
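
One way to picture the threshold rule is as a comparison on an ordered list of grades. The sketch below assumes the six grades from the grading table, ranked best to worst. The names `GRADE_ORDER` and `should_notify` are made up for this example.

```python
# Grades ranked from best to worst, taken from the grading table.
GRADE_ORDER = ["A", "B+", "B", "C", "D", "F"]


def should_notify(grade, threshold="C"):
    """Return True when the grade is at the threshold or worse."""
    return GRADE_ORDER.index(grade) >= GRADE_ORDER.index(threshold)


for grade in GRADE_ORDER:
    print(grade, should_notify(grade))
```

With the default threshold of C, grades A, B+, and B stay silent while C, D, and F trigger a notification.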

Severity mapping: PagerDuty and OpsGenie automatically map the health grade to severity levels. An F grade pages the on-call as P1/Critical. A C grade creates a warning. A/B grades resolve existing incidents automatically.

A Desktop Application Built for Operations Teams

VCF Health Check is a professional desktop application — not a script you run in a terminal. It is built with Python and Tkinter, runs on Windows, macOS, and Linux, and provides a complete graphical interface for every feature.

Splash Screen and First Impressions

The application launches with an animated splash screen — gradient background, floating particle effects, progress bar, and fade-in branding. It is a small detail, but it sets the tone: this is a professional tool, not a weekend project.

Dashboard

A single screen showing your current health grade as a large color-coded badge (A green through F red), numeric score percentage, component-by-component health cards in a grid layout, executive summary text, previous grade comparison, and a score trend chart with grade zone shading. If you manage multiple profiles, a multi-environment overview section shows every environment's grade at a glance.

Two quick-action buttons: "Run New Check" and "View Reports."

Environment Configuration

Nine collapsible form sections covering every configuration option:

Management Endpoints: vCenter, SDDC Manager, NSX VIP, NSX Node, VCF Operations, Fleet/vRSLCM
Credentials: SSO, NSX, VCF Operations, Fleet, ESXi — each with show/hide password toggle
ESXi Hosts: Space-separated list of host IPs/FQDNs
Timeouts: Per-component configurable timeouts (Infrastructure, vCenter, SDDC, NSX, Operations, Fleet)
Thresholds: Certificate warning days, datastore warning/critical %, task count, snapshot hours, backup hours, DFW rule count, cluster CPU warn/critical %, cluster memory warn/critical %
Notifications: SMTP settings, Slack webhook, Teams webhook, generic webhook, notify threshold grade
Incident Management: PagerDuty routing key, OpsGenie API key
Scoring Weights: Per-component weight multiplier for health score calculation
Customer Branding: Company name, logo file, contact email, environment label

Every field has a tooltip on hover. Input validation highlights errors (IP/FQDN format, email format, URL format, numeric ranges). Unsaved-changes detection warns before navigating away.

Run Options: Checkboxes for --fix, --cleanup-tasks, --diff, --csv, --markdown, --quiet, and --json-only.

Profile Management

Managing multiple VCF environments is first-class:

Save / Load: Named profiles stored with encrypted credentials
Clone: Duplicate a profile for a similar environment
Import .env / Export .env: Exchange configurations with the bash engine
Import JSON / Export JSON: Portable configuration backup
Reset to Defaults: One-click return to default values
Profile Dropdown: Switch between environments instantly from the sidebar

Operators get read-only access to profiles. Admins get full create/save/delete/clone/import/export.

Run Check

"Run Full Health Check" button to execute all checks
"Validate Only" button to test connectivity without running checks
"Stop" button to terminate a running check
Live terminal output with color-coded results (green=PASS, yellow=WARN, red=FAIL)
Elapsed timer updating every second
In-GUI Scheduling: Dropdown for Off / Every 30 Minutes / Every Hour / Every 4 Hours / Daily
After completion: one-click buttons to open TXT, HTML, JSON, CSV, and Markdown reports

Reports View

Scrollable list of all historical report sets. Each row shows the grade badge, numeric score, date/time, and pass/warn/fail/total counts. One-click buttons to open HTML, JSON, TXT, or PDF reports. Admin-only "Cleanup Old Reports" button deletes reports older than the retention threshold.

Run History and Trends

Statistics Card: Total runs, latest grade, trend arrow (improving/stable/declining), average score, best grade, worst grade
Score Trend Chart: Line graph with grade zone bands (A green / B yellow-green / C orange / D dark-orange / F red), data points, area fill, date labels
Grade Distribution: Count of runs at each grade level

Suppressions (Admin Only)

Manage known issues that should not count as failures. Each suppression rule has a regex pattern matched against check messages and a note field for the reason or ticket reference. Matched checks are recorded as SKIP and excluded from the failure count. Changes persist to known-issues.json.
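
A minimal sketch of how regex suppressions could be applied to check results follows. The rule layout (pattern plus note), the result tuples, and the ticket reference are hypothetical, and the on-disk format of known-issues.json is not shown here. The sketch also assumes that both FAIL and WARN results can be suppressed.

```python
import re


def apply_suppressions(results, rules):
    """Turn FAIL/WARN results whose message matches any rule into SKIP.

    results: list of (status, message) tuples.
    rules:   list of (regex_pattern, note) tuples.
    """
    compiled = [re.compile(pattern) for pattern, _note in rules]
    adjusted = []
    for status, message in results:
        if status in ("FAIL", "WARN") and any(rx.search(message) for rx in compiled):
            status = "SKIP"  # known issue: excluded from the failure count
        adjusted.append((status, message))
    return adjusted


# Hypothetical rule and results for illustration only.
rules = [(r"backup .* \d+ days", "CHG-1234: backup target migration in progress")]
results = [
    ("FAIL", "SDDC Manager backup has not run in 47 days"),
    ("FAIL", "vpxd service is not running"),
    ("PASS", "DNS forward resolution OK"),
]
adjusted = apply_suppressions(results, rules)
failures = sum(1 for status, _ in adjusted if status == "FAIL")
print(adjusted, failures)
```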

User Management (Admin Only)

Add users with username, role (admin/operator), and initial password
Reset passwords for any user
Delete users (cannot delete the last admin)
User list shows username and role tag

Audit Log (Admin Only)

Terminal-style scrollable display of the last 500 audit entries. Every action is logged: login, logout, password changes, profile operations, health check runs, report access, user management, suppression changes, settings modifications, license activation.

Filter/search input, refresh button, clear log button, and CSV export for external analysis.

Settings (Admin Only)

Script Paths: Health check script and bash interpreter with browse and auto-detect buttons
Appearance: System, Light, or Dark theme (sidebar always stays dark)
Security: Password expiry days and session timeout minutes
License: License key input, activate button, status display (Active / Expired / Trial / Grace Period)
LDAP/AD: Server URL, Base DN, Bind credentials, Enable checkbox (only visible if ldap3 package installed)

Help View

Built-in 13-section guide covering every feature: Getting Started, Dashboard, Environment, Run Check, Reports, Run History, Suppressions, Users, Audit Log, Settings, Scheduling, Troubleshooting, and System Requirements. No external documentation needed.

About View

Application version, copyright, license status with expiry date, system information (Python version, platform, Tkinter version, paths, profile count, user count), and contact information. Accessible via sidebar navigation or F1 keyboard shortcut.

Security Built for Enterprise

Credential Encryption

All profile credentials encrypted at rest using Fernet symmetric encryption (from the cryptography Python package). A machine-local encryption key is generated at ~/.vcf-hc-key with chmod 600 permissions. Legacy base64-only encoding is automatically detected and upgraded to Fernet on the next profile save. Exported .env files are created with chmod 600.

Password Hashing

User passwords hashed with PBKDF2-HMAC-SHA256 using 310,000 iterations and a 16-byte random salt unique to each password. Legacy hash formats (SHA-256 unsalted, lower-iteration PBKDF2) are automatically detected and upgraded after successful login.
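
For readers who want to see what these parameters look like in practice, here is a small sketch using Python's standard `hashlib.pbkdf2_hmac`. The function names and the choice to return the salt alongside the digest are assumptions for illustration. The application's actual storage format is not described in this brochure.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 310_000  # iteration count stated above
SALT_BYTES = 16       # salt size stated above


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Using `hmac.compare_digest` instead of `==` avoids leaking timing information during verification.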

Role-Based Access Control

Two roles: admin and operator. Admins have full access to all views and operations. Operators can run checks and view reports but cannot modify settings, manage users, edit suppressions, or delete profiles.

Authentication and Session Security

Login required on application start
Session timeout with automatic lock and re-authentication dialog (configurable minutes)
Brute-force lockout: 5 failed attempts trigger a 15-minute account lock with timestamp tracking
Password expiry with configurable days and login-time warnings
Change Password available to any logged-in user
Optional LDAP/Active Directory authentication for enterprise environments
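
The lockout rule in the list above can be sketched as a small state tracker. The class name and the decision to reset the counter once a lock is applied are assumptions. Only the 5-attempt and 15-minute figures come from this brochure.

```python
from datetime import datetime, timedelta

MAX_ATTEMPTS = 5
LOCK_DURATION = timedelta(minutes=15)


class LoginGuard:
    """Track failed logins for one account and lock it after too many."""

    def __init__(self):
        self.failures = 0
        self.locked_until = None  # timestamp when the lock expires

    def is_locked(self, now):
        return self.locked_until is not None and now < self.locked_until

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= MAX_ATTEMPTS:
            self.locked_until = now + LOCK_DURATION
            self.failures = 0  # start counting afresh after the lock

    def record_success(self):
        self.failures = 0
        self.locked_until = None
```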

Audit Trail

Every action logged to vcf-health-audit.log with ISO timestamp, username, role, action category, and detail. Exportable to CSV for compliance reporting. Admin-only access.

Shell Script Security

Lock file prevents concurrent execution. All temporary files written to mktemp-based directories. Credentials passed via environment variables, never as command-line arguments. SSL verification configurable with custom CA bundle support.

The Grading System

VCF Health Check does not give you a wall of data. It gives you a single letter grade that instantly communicates the state of your environment to anyone.

Grade	Criteria	What It Means
A	Score >= 95%, zero failures, zero warnings	Everything is healthy. No action needed.
B+	Score >= 90%, zero failures; or >= 85% with <= 1 failure	Minor issues only. Review at your convenience.
B	Zero failures at any score; or >= 80% with <= 3 failures	Warnings present but no critical problems. Plan remediation.
C	Score >= 70% with <= 5 failures	Multiple issues need attention this week.
D	Score >= 50%	Significant problems. Address immediately.
F	Score < 50%	Critical state. Environment at risk of outage.
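
Read top to bottom, the criteria in the table translate into a short decision function. The sketch below assumes the rows are evaluated in order and the first match wins. That ordering is an interpretation of the table, not a statement of how the product is implemented.

```python
def assign_grade(score, failures, warnings):
    """Map a percentage score and failure/warning counts to a letter grade."""
    if score >= 95 and failures == 0 and warnings == 0:
        return "A"
    if (score >= 90 and failures == 0) or (score >= 85 and failures <= 1):
        return "B+"
    if failures == 0 or (score >= 80 and failures <= 3):
        return "B"
    if score >= 70 and failures <= 5:
        return "C"
    if score >= 50:
        return "D"
    return "F"


print(assign_grade(97, 0, 0))   # all checks healthy
print(assign_grade(75, 4, 2))   # several failures
```

Under this reading, a run with zero failures never grades below B, however many warnings it carries.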

Weighted Scoring: Each component carries a configurable weight multiplier. By default, vCenter, SDDC Manager, and NSX are weighted 2x (because failures in these components cascade). Infrastructure, VCF Operations, and Fleet are weighted 1x. Adjust weights to match your environment's priorities.

Formula: Sum of (component_pass_checks × component_weight) divided by sum of (component_total_checks × component_weight).
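
As a worked example of the weighted score, the sketch below uses the default weights and made-up pass counts for three components. The function name and the sample numbers are illustrative only.

```python
DEFAULT_WEIGHTS = {
    "Infrastructure": 1,
    "vCenter": 2,
    "SDDC Manager": 2,
    "NSX": 2,
    "VCF Operations": 1,
    "Fleet": 1,
}


def weighted_score(results, weights=DEFAULT_WEIGHTS):
    """results maps component name -> (passed_checks, total_checks)."""
    passed = sum(p * weights[name] for name, (p, _t) in results.items())
    total = sum(t * weights[name] for name, (_p, t) in results.items())
    return 100.0 * passed / total if total else 0.0


# Hypothetical run: vCenter has 6 checks that did not pass.
sample = {"Infrastructure": (14, 14), "vCenter": (30, 36), "NSX": (13, 13)}
print(round(weighted_score(sample), 2))
```

Here the weighted totals are 14 + 60 + 26 = 100 passed out of 14 + 72 + 26 = 112, or about 89.3%. An unweighted count of 57 out of 63 would give about 90.5%, so the vCenter problems pull the score down more.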

Automation That Runs While You Sleep

OS-Level Scheduling

Platform	How	One Command
Windows	Task Scheduler	`--schedule` generates XML and registers the task
Linux (cron)	crontab	`--cron daily` generates the cron entry
Linux (systemd)	Timer + Service	`--cron` generates both unit files
macOS	crontab	`--cron` generates the entry

In-GUI Scheduling

The Run Check view includes a scheduling dropdown: Off, Every 30 Minutes, Every Hour, Every 4 Hours, or Daily. Select a frequency and the application runs health checks automatically in the background while the GUI stays open.

Auto-Remediation

Some problems have obvious, safe fixes. Enable the --fix flag and VCF Health Check will:

Cancel stale SDDC Manager tasks blocking the lifecycle queue
Apply safe configuration corrections for known drift conditions
Acknowledge stale vCenter alarms older than 72 hours
Clean up failed vCenter tasks older than 7 days
Restart stopped NSX services
Report a summary of all fix actions taken

For everything else, the remediation playbook tells you exactly what to do — step by step, with estimated time to fix each issue, grouped by component, with checkboxes you can tick off as you work through them.

Built for MSPs and Service Providers

If you manage VCF environments for multiple clients, VCF Health Check was designed with you in mind.

Branded Reports

Put your company logo and name on every health check report. Configure company name, logo file, and contact email per profile. Your clients see your brand, not ours. Every report — HTML, PDF, TXT — carries your branding.

Client Tracking

Organize environments by client using the optional client management module. Track usage per client. Generate billing reports per client. Know exactly how much monitoring work you are doing for each customer.

Usage Analytics and Billing

The optional usage tracker module shows run counts per client and environment, with CSV, JSON, and billing summary exports. Built-in data for invoicing without manual tracking.

White Label

At the Enterprise tier, completely rebrand VCF Health Check as your own product. Your name, your branding, your product — powered by our engine.

API Export

Pull health check data programmatically into your own dashboards. The JSON report, Prometheus metrics, and webhook integrations let you feed data into ServiceNow, Grafana, Datadog, or any system your operations team uses.

Multi-Environment Visibility

Managing production, staging, and development environments? Monitoring VCF deployments across multiple data centers or client sites?

Multi-Environment Dashboard

The --merge-reports flag takes a directory of JSON reports from multiple environments and generates a single dark-themed HTML dashboard. Each environment appears as a card showing: environment name, grade badge, numeric score, last run date, and pass/warn/fail counts. One page, every environment, instant visibility.

In-GUI Multi-Environment Overview

When you have more than one profile configured, the Dashboard view automatically shows a multi-environment section with cards for every profile — grade, score, and last run date. No extra configuration needed.

Configuration That Fits Your Environment

Configurable Thresholds

Every threshold is adjustable. Nothing is hardcoded:

Threshold	Default	What It Controls
Certificate Warning	30 days	How far in advance to warn about expiring certificates
Datastore Warning	80%	Datastore capacity warning level
Datastore Critical	90%	Datastore capacity critical level
Cluster CPU Warning	70%	Cluster CPU utilization warning
Cluster CPU Critical	85%	Cluster CPU utilization critical
Cluster Memory Warning	70%	Cluster memory utilization warning
Cluster Memory Critical	85%	Cluster memory utilization critical
Snapshot Warning	72 hours	VM snapshot age before warning
Backup Warning	48 hours	SDDC Manager backup age before warning
DFW Rule Warning	500 rules	NSX firewall rule count before warning
Task Warning	5 tasks	In-progress SDDC task count before warning
Report Retention	30 days	How long to keep historical reports

Scoring Weights

Component	Default Weight	Why
Infrastructure	1x	Foundation layer — DNS, certs, network
vCenter	2x	Core compute management — failures cascade
SDDC Manager	2x	Lifecycle control plane — failures block operations
NSX	2x	Network overlay — failures isolate workloads
VCF Operations	1x	Monitoring — failure reduces visibility but not operations
Fleet	1x	Aria lifecycle — failure blocks updates but not operations

Adjust weights to match your priorities. A financial services company might weight NSX at 3x because microsegmentation is compliance-critical. An MSP might weight vCenter at 3x because it is the primary client-facing component.

Expected Down Components

If a component is intentionally offline (maintenance, decommissioning, not yet deployed), use the --known-down flag or the EXPECTED_DOWN configuration. Checks against that component are suppressed so they do not drag down the grade.

CLI Power

Flag	What It Does
`--only COMPONENT`	Run checks for one component only
`--skip COMPONENT`	Skip one component
`--known-down COMPONENT`	Suppress failures for an expected-down component
`--env FILE`	Load configuration from a .env file
`--fix`	Auto-remediate safe issues
`--cleanup-tasks`	Cancel stale SDDC Manager tasks
`--diff`	Compare current run to previous run
`--validate`	Test connectivity only, no checks
`--json-only`	Skip HTML/TXT, generate JSON only
`--csv`	Generate CSV report
`--markdown`	Generate Markdown report
`--prometheus`	Emit Prometheus metrics to stdout
`--ansible`	Generate Ansible inventory YAML
`--archive`	Compress report set into tar.gz
`--syslog HOST:PORT`	Override syslog destination
`--backup-config`	Export topology and settings JSON
`--merge-reports DIR`	Merge reports into multi-environment dashboard
`--quiet`	Suppress terminal output
`--schedule`	Register Windows Task Scheduler task
`--cron [daily/hourly]`	Generate cron entry and systemd timer
`--servicenow`	Create ServiceNow incidents from failures
`--jira`	Create Jira issues from failures
`--email-report`	Send HTML report as email attachment

Architecture

VCF Health Check is a desktop application that runs entirely on your infrastructure. No cloud. No agents. No SaaS. It uses the existing management APIs that are already exposed by every VCF component.

How It Works

1. You configure your environment endpoints and credentials in the GUI or a .env file

2. You click "Run Full Health Check" (or the scheduler triggers it)

3. The bash engine authenticates against every component API and runs all checks

4. Infrastructure checks run sequentially. vCenter checks also run sequentially so they can reuse one API session. SDDC Manager, NSX, VCF Operations, and Fleet checks run in parallel background subshells. ESXi per-host checks are parallelized with background process tracking. Python ESXi sensor checks use ThreadPoolExecutor with 8 concurrent workers.

5. The Python report generator reads all results and produces HTML, JSON, and text reports

6. Notifications fire based on the grade threshold

7. Results are stored for historical trending
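
The per-host parallelism mentioned in step 4 can be sketched with Python's `concurrent.futures`. The helper name and the stand-in check function are hypothetical. A real check would query each host's hardware sensors, which this example does not attempt.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # worker count stated in step 4


def check_all_hosts(hosts, check):
    """Run check(host) for every host on a pool of 8 threads.

    check must return a (host, status) tuple. Results come back in input
    order because ThreadPoolExecutor.map preserves ordering.
    """
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return dict(pool.map(check, hosts))


def fake_sensor_check(host):
    # Stand-in for a real sensor query: flags one made-up host as unhealthy.
    return host, "FAIL" if host == "esxi-03.example.local" else "PASS"


hosts = ["esxi-%02d.example.local" % n for n in range(1, 13)]
print(check_all_hosts(hosts, fake_sensor_check))
```

Threads suit this kind of work because each check spends most of its time waiting on the network rather than using the CPU.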

Codebase

File	Lines	Language	Role
vcf-health-check-gui.py	6,389	Python / Tkinter	Desktop GUI application
vcf-health-check.sh	3,757	Bash	Health check engine
vcf_checks.py	1,053	Python	Report generator and vCenter API checks
Total	11,199

267 automated tests across two test suites validate every feature.

System Requirements

Requirement	Detail
Python	3.8 or later
Bash	4.0 or later (Git Bash on Windows)
curl	For all API calls
OpenSSL	For certificate expiry checks
nslookup	For DNS checks
Network	HTTPS (443) to all VCF management endpoints
Optional: cryptography	Python package for credential encryption
Optional: ldap3	Python package for LDAP/AD authentication
Optional: Chrome/Edge	For PDF report generation

Runs on Windows 10/11, macOS, and any Linux distribution. No installation required — run the files directly. No cloud dependency. All data stays on your infrastructure.

What Makes This Different

Built exclusively for VMware Cloud Foundation. Not adapted from a generic monitoring platform. Not a plugin for a larger tool. Every one of the 150+ checks was designed specifically for VCF components and the relationships between them. Orphaned VM detection. Unclaimed vSAN disk identification. Cluster capacity thresholds. SDDC Manager stale task detection. BOM compliance. These are VCF problems that generic tools do not know to check.

Runs in under 5 minutes. A complete health audit of your entire VCF environment in under five minutes. Parallel execution means SDDC Manager, NSX, VCF Operations, and Fleet checks all run at the same time.

Gives you a grade. Not a wall of data. A single letter grade (A through F) that instantly communicates the state of your environment to anyone — from the engineer who needs to fix things to the executive who needs to report on them.

Tells you how to fix things. Not just what is wrong. Every failure and warning comes with specific remediation steps, estimated time to resolve, and a downloadable playbook checklist you can hand to an engineer and say "fix these."

Works offline. No cloud dependency. No data leaving your network. No SaaS subscription that requires internet access. Install it, run it, keep your data on your infrastructure.

Generates reports you can actually use. Ten report formats. Hand the HTML to a client. Feed the JSON to ServiceNow. Scrape the Prometheus metrics with Grafana. Generate an Ansible inventory from your live environment. Export a remediation playbook. Print the PDF and pin it to the wall. The reports are designed to be useful, not just complete.

Enterprise security from day one. Fernet-encrypted credentials, PBKDF2 password hashing with 310,000 iterations, role-based access control, brute-force lockout, session timeouts, full audit trail, optional LDAP integration. This is not a script with passwords in a text file.

Pricing

VCF Health Check is priced per environment per month. The more environments you monitor, the lower your per-environment cost. Annual billing saves 15%.

One-time $500 onboarding fee includes your setup call, configuration assistance, and branded report setup.

Per-Environment Rates

Environments	Standard	Professional	Enterprise
1-9	$299/env/mo	$399/env/mo	$499/env/mo
10-24	$249/env/mo	$349/env/mo	$449/env/mo
25-99	$199/env/mo	$299/env/mo	$399/env/mo
100+	$149/env/mo	$249/env/mo	$349/env/mo

Standard Edition

All 150+ health checks. All 10 report formats. All 10 notification channels. Desktop GUI application with encrypted credential storage, role-based access, and audit trail. Everything you need to monitor your VCF environments professionally.

Professional Edition

Everything in Standard, plus branded reports with your company logo, automated scheduling (Windows Task Scheduler, cron, systemd), and multi-environment dashboard. This is the edition for service providers who want to look professional and save time.

Enterprise Edition

Everything in Professional, plus API export for dashboard integrations (ServiceNow, Grafana, Datadog), full white-label capability to rebrand the product as your own, usage analytics and billing module, and priority support. This is the edition for organizations that want maximum flexibility and zero limitations.

Example Monthly Costs

A solo consultant monitoring 3 environments on Standard pays $897 per month. An MSP monitoring 25 environments on Professional pays $7,475 per month. A large partner monitoring 50 environments on Enterprise pays $19,950 per month. Volume pricing for 100+ environments is available on request.
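
The arithmetic behind these examples is simple to reproduce. The sketch below encodes the rate table and assumes the tier rate applies to every environment in the account, which is consistent with the three figures above. It excludes the one-time onboarding fee and the annual discount, and the function name is illustrative.

```python
# (minimum environments, USD per environment per month by edition)
RATES = [
    (100, {"Standard": 149, "Professional": 249, "Enterprise": 349}),
    (25, {"Standard": 199, "Professional": 299, "Enterprise": 399}),
    (10, {"Standard": 249, "Professional": 349, "Enterprise": 449}),
    (1, {"Standard": 299, "Professional": 399, "Enterprise": 499}),
]


def monthly_cost(environments, edition):
    """Return the monthly cost in USD for a number of environments."""
    if environments < 1:
        raise ValueError("at least one environment is required")
    for minimum, prices in RATES:
        if environments >= minimum:
            return environments * prices[edition]


print(monthly_cost(3, "Standard"))
print(monthly_cost(25, "Professional"))
print(monthly_cost(50, "Enterprise"))
```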

Partner Program

VCF Health Check is available through channel partnerships for VMware/Broadcom solution providers, MSPs, and distributors. Whether you resell VCF solutions, manage client environments, or distribute VMware technology — we have a partnership model designed for your business.

Channel Partner Benefits

Branded Reports — deliver health check reports with your company logo and name on every page
White-Label Capability — rebrand VCF Health Check as your own product (Enterprise tier)
Per-Client Tracking — usage analytics with billing summary exports for each of your customers
Multi-Environment Dashboard — consolidated health view across all client environments
Scheduling & Automation — automated health checks on your schedule with zero manual intervention

Distribution Partnerships

We offer distribution agreements with competitive reseller margins for technology distributors. One partnership provides access to dozens of downstream partner customers through your existing reseller network. White-label Enterprise licensing enables your partners to rebrand and resell under their own brand.

Become a Partner

We are actively onboarding VMware/Broadcom partners, MSPs, and distributors. To explore a partnership:

Email: [mhayes@virtualcontrolllc.com](mailto:mhayes@virtualcontrolllc.com)
Request a Partner Demo — see branded reports, white-label capability, and multi-environment dashboards in action
Custom Pricing — volume and distribution pricing available for qualified partners

Get Started

Contact Virtual Control LLC to discuss your VCF monitoring needs and get your license key.

Every new installation includes a Setup Wizard that walks you through activation in under two minutes. Paste your key, confirm your paths, and you are running health checks immediately.

No lengthy onboarding. No professional services engagement. No training required. If you can manage a VCF environment, you can use VCF Health Check.

Your VCF environments are too important to monitor with hope and manual spot checks. Give your team — and your clients — the confidence that comes from knowing everything is healthy, every single day.

VCF Health Check is a product of Virtual Control LLC.

VMware and VMware Cloud Foundation are trademarks of Broadcom Inc.