20 Python diagnostic scripts for VCF 9.0.1 nested lab troubleshooting. All use Paramiko for SSH and run from a Windows workstation.
```
pip install paramiko
python <script-name>.py
```
| Target | IP | User | Purpose |
|---|---|---|---|
| SDDC Manager | 192.168.1.241 | vcf | API gateway, database access (su to root) |
| NSX Node | 192.168.1.71 | root | Direct NSX service management |
| NSX VIP | 192.168.1.70 | admin | NSX cluster API (via curl from SDDC Mgr) |
Note: All scripts use the standard lab password. If `vcf` is locked out, run `faillock --user vcf --reset` from the SDDC Manager console as root.
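A minimal sketch of how a script might open an SSH session to one of the targets above, assuming Paramiko is installed. The `TARGETS` mapping, the `connect` and `run` helpers, and the `LAB_PASSWORD` environment variable are illustrative names, not taken from the scripts themselves. The NSX VIP is left out because the table says it is reached via curl from SDDC Manager rather than by direct SSH.

```python
import os

# Hosts and users copied from the table above; the dictionary keys are illustrative.
TARGETS = {
    "sddc": {"host": "192.168.1.241", "user": "vcf"},
    "nsx_node": {"host": "192.168.1.71", "user": "root"},
}


def connect(name, password=None, timeout=15):
    """Open an SSH session to a named target and return the client."""
    import paramiko  # imported lazily so TARGETS is usable without Paramiko

    target = TARGETS[name]
    client = paramiko.SSHClient()
    # Lab-only convenience: accept unknown host keys automatically.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(
        target["host"],
        username=target["user"],
        # LAB_PASSWORD is a hypothetical environment variable name.
        password=password or os.environ["LAB_PASSWORD"],
        timeout=timeout,
    )
    return client


def run(client, command):
    """Run one command and return (exit_status, stdout_text, stderr_text)."""
    _stdin, stdout, stderr = client.exec_command(command)
    out = stdout.read().decode()
    err = stderr.read().decode()
    return stdout.channel.recv_exit_status(), out, err
```

Reading the password from the environment keeps it out of the script file; the actual scripts may handle it differently.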
| Scenario | Script |
|---|---|
| Is everything healthy? | python quick_status.py |
| NSX slow after boot? | python nsx_monitor.py |
| Credential operation failed? | python check_remediate_error.py |
| Need to update NSX password? | python nsx_cred_update.py |
| NSX CPU overloaded? | python nsx_slim.py |
| Put NSX services back? | python nsx_restart_all.py |
| Clear stale DB locks? | python clear_locks.py |
| Fix stuck tasks in DB? | python fix_stuck_tasks.py |
| Full cascade fix? | python full_remediate_fix.py |
| System clean after fix? | python final_check.py |
| Script | Connects To | What It Does |
|---|---|---|
| quick_status.py | SDDC Manager | Start here. NSX status, VIP health, resource locks, notifications, credentials |
| final_check.py | SDDC Manager | Lightweight: notifications and resource locks only |
| diag.py | localhost | DNS resolution, TCP 443 connectivity, ARP/routing from Windows host |
| nsx_monitor.py | NSX Node | Polls cluster status + load avg every 60s for 10 iterations |
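The fixed-interval polling that nsx_monitor.py is described as doing (every 60s for 10 iterations) could look roughly like the sketch below. The `poll` function and its `probe` and `sleep` parameters are hypothetical; in practice `probe` would wrap whatever SSH command returns the cluster status or load average.

```python
import time


def poll(probe, interval=60, iterations=10, sleep=time.sleep):
    """Call probe() `iterations` times, pausing `interval` seconds between calls."""
    samples = []
    for i in range(iterations):
        samples.append(probe())
        if i < iterations - 1:  # no need to wait after the last sample
            sleep(interval)
    return samples


if __name__ == "__main__":
    # Stand-in probe for demonstration; a real one would query the NSX node.
    readings = poll(lambda: time.strftime("%H:%M:%S"), interval=1, iterations=3)
    print(readings)
```

Passing `sleep` as a parameter makes the loop easy to exercise without actually waiting.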
| Script | Connects To | What It Does |
|---|---|---|
| nsx_check.py | SDDC Manager | Tests both NSX VIP and direct node connectivity; diagnoses VIP failover issues |
| nsx_diag.py | NSX Node | Top CPU consumers, disk space, service health via API, catalina errors |
| nsx_resource_check.py | SDDC Manager | NSX clusters, credentials, warnings, DB resource state |
| sddc_nsx_status.py | SDDC Manager | Compares SDDC Manager's NSX status vs actual NSX VIP cluster status |
```
python quick_status.py       # 1. Overall health
python nsx_check.py          # 2. VIP + node connectivity
python nsx_diag.py           # 3. Performance & services
python sddc_nsx_status.py    # 4. SDDC Manager vs NSX sync
```
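If the four-step sequence above is run often, it could be wrapped in a small driver. This is only a sketch: `run_in_order` is a hypothetical helper, and it assumes each script exits with a non-zero status on failure, which the scripts may or may not do.

```python
import subprocess
import sys

# Order taken from the sequence above.
DIAGNOSTIC_ORDER = [
    "quick_status.py",
    "nsx_check.py",
    "nsx_diag.py",
    "sddc_nsx_status.py",
]


def run_in_order(scripts=DIAGNOSTIC_ORDER, runner=subprocess.run):
    """Run each script with the current interpreter; stop at the first failure."""
    results = []
    for script in scripts:
        completed = runner([sys.executable, script])
        results.append((script, completed.returncode))
        if completed.returncode != 0:  # assumption: non-zero means the check failed
            break
    return results
```

Stopping at the first failure keeps the output focused on the earliest layer that is unhealthy.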
| Script | Modifies | What It Does |
|---|---|---|
| nsx_cred_update.py | Yes | Full workflow: health checks, lists credentials, updates admin API, monitors for 200s |
| nsx_retry_when_ready.py | Yes | Waits up to 15 min for NSX API, then submits update with 450s monitoring |
| check_disconnected.py | No | Inspects all credential objects for connection status fields |
| check_remediate_error.py | No | Failed task details with full error messages, NSX connectivity test, log search |
WARNING: Do not run credential update scripts unless NSX status is ACTIVE in SDDC Manager and the cluster is STABLE at the VIP. A failed update creates stale locks and stuck tasks, which then require database repair.
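The precondition in the warning can be expressed as a simple gate that a wrapper checks before submitting an update. The function name `safe_to_update` and the idea of passing both statuses as plain strings are assumptions made for illustration.

```python
def safe_to_update(sddc_status, vip_status):
    """Return True only when both views of NSX report a healthy state."""
    sddc_ok = sddc_status.strip().upper() == "ACTIVE"   # status held by SDDC Manager
    vip_ok = vip_status.strip().upper() == "STABLE"     # cluster status at the NSX VIP
    return sddc_ok and vip_ok


if __name__ == "__main__":
    if not safe_to_update("ACTIVATING", "STABLE"):
        print("NSX not ready; skipping credential update")
```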
| Script | Action | What It Does |
|---|---|---|
| nsx_slim.py | Stops | Stops 5 non-essential services to free CPU during boot storm |
| nsx_restart_all.py | Starts | Restarts all services stopped by nsx_slim.py |
| nsx_fix_svc.py | Restarts | Restarts search, nsx-sha, nsx-appl-proxy, validates health |
```
python nsx_slim.py           # Free CPU (if load > 30)
# Wait for load to drop below 15
python nsx_restart_all.py    # Bring services back
python nsx_check.py          # Verify cluster health
```
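The two thresholds in the workflow above (stop services above a load of 30, bring them back below 15) form a simple hysteresis rule. The sketch below assumes the load is read from `/proc/loadavg`-style text on the NSX node; `parse_loadavg` and `next_action` are hypothetical helper names.

```python
HIGH_LOAD = 30.0  # stop non-essential services above this
SAFE_LOAD = 15.0  # bring services back below this


def parse_loadavg(text):
    """Return the 1-minute load average from /proc/loadavg-style text."""
    return float(text.split()[0])


def next_action(load, slimmed):
    """Pick the next step from the current load and whether services are stopped."""
    if not slimmed:
        return "nsx_slim.py" if load > HIGH_LOAD else "none"
    return "nsx_restart_all.py" if load < SAFE_LOAD else "wait"


if __name__ == "__main__":
    load = parse_loadavg("32.10 28.55 20.01 5/1200 12345")
    print(load, next_action(load, slimmed=False))
```

Using separate stop and restart thresholds avoids flapping when the load hovers near a single cutoff.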
Scripts that modify SDDC Manager's PostgreSQL database. All use the trust auth workaround (back up pg_hba.conf, set trust, apply the fix, restore pg_hba.conf).
| Script | What It Does |
|---|---|
| clear_locks.py | Fixes NSX status (ACTIVATING/ERROR -> ACTIVE), clears lock table, restarts operationsmanager |
| fix_stuck_tasks.py | Marks stuck task_metadata as resolved, clears task_lock, fixes execution_to_task orphans |
| full_remediate_fix.py | Complete cascade fix: NSX health check + DB fix (status + locks + tasks) + service restart |
| find_pg_pass.py | Searches for PostgreSQL password in config files (read-only) |
| get_task.py | Retrieves task details by ID with subtask errors (edit task_id before running) |
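The trust auth workaround mentioned above (back up, set trust, fix, restore) maps naturally onto a context manager that guarantees the restore step. The sketch below works on a local file path for clarity; the real scripts presumably do the equivalent over SSH. The names `to_trust`, `trust_auth`, and `reload_postgres` are hypothetical, and only `host` records with a CIDR address are rewritten. PostgreSQL re-reads pg_hba.conf only on reload or restart, so the caller supplies that step.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


def to_trust(line):
    """Rewrite the auth method of a 'host db user cidr method' line to trust."""
    fields = line.split()
    if len(fields) >= 5 and fields[0] == "host":
        fields[4] = "trust"
        return " ".join(fields[:5])  # auth options are dropped; trust needs none
    return line  # comments, blanks, and other record types are left untouched


@contextmanager
def trust_auth(hba_path, reload_postgres=lambda: None):
    """Temporarily switch host entries to trust, restoring the file afterwards."""
    hba = Path(hba_path)
    backup = hba.with_name(hba.name + ".bak")
    shutil.copy2(hba, backup)  # 1. back up
    try:
        lines = hba.read_text().splitlines()
        hba.write_text("\n".join(to_trust(l) for l in lines) + "\n")  # 2. set trust
        reload_postgres()
        yield  # 3. caller applies the fix here
    finally:
        shutil.move(str(backup), str(hba))  # 4. restore, even if the fix failed
        reload_postgres()
```

Putting the restore in `finally` means a failed fix never leaves the database open with trust authentication.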
Always use `PAGER=cat` when running psql to prevent pager traps in remote sessions.
```sql
-- Connect: su - postgres -c "PAGER=cat psql -h 127.0.0.1 -d platform"

-- Fix NSX status (covers ACTIVATING and ERROR)
UPDATE nsxt SET status = 'ACTIVE' WHERE status != 'ACTIVE';

-- Clear stale locks
DELETE FROM lock;

-- Resolve stuck tasks
UPDATE task_metadata SET resolved = true WHERE resolved = false;
DELETE FROM task_lock;
```
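When one of these statements is sent over SSH instead of typed interactively, the nested quoting (SQL inside psql's `-c`, inside `su -c`) is easy to get wrong. A minimal sketch using `shlex.quote` from the standard library; the helper name `psql_command` is hypothetical, and the database name and host are the ones shown in the connect line above.

```python
import shlex


def psql_command(sql, database="platform", host="127.0.0.1"):
    """Build the shell command that runs one SQL statement as the postgres user."""
    inner = "PAGER=cat psql -h {} -d {} -c {}".format(
        shlex.quote(host), shlex.quote(database), shlex.quote(sql)
    )
    # The whole psql invocation is quoted again because su -c takes one argument.
    return "su - postgres -c " + shlex.quote(inner)


if __name__ == "__main__":
    print(psql_command("UPDATE nsxt SET status = 'ACTIVE' WHERE status != 'ACTIVE';"))
```

Quoting in two passes keeps the single quotes around `'ACTIVE'` intact by the time the statement reaches psql.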
Full procedure: Troubleshooting Handbook Section 10
```
Problem: Credential operation failed
|
+-- python quick_status.py
|   |
|   +-- NSX Status = ACTIVATING or ERROR?
|   |   +-- python clear_locks.py          (fix DB status + locks)
|   |   +-- python fix_stuck_tasks.py      (resolve stuck tasks)
|   |   +-- OR: python full_remediate_fix.py  (all-in-one)
|   |
|   +-- NSX VIP returning 503?
|   |   +-- python nsx_diag.py             (check load)
|   |   +-- Load > 30? -> python nsx_slim.py  (free CPU)
|   |   +-- Wait -> python nsx_monitor.py  (track recovery)
|   |
|   +-- All green?
|       +-- python nsx_cred_update.py      (retry update)
```
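The decision tree above can also be read as a small lookup function. This is a sketch only: `next_steps` and its parameters are hypothetical, and "all green" is interpreted here as NSX status ACTIVE with the VIP answering HTTP 200, which is an assumption about what quick_status.py reports.

```python
def next_steps(nsx_status, vip_http_status, load=None):
    """Return the scripts the tree suggests for the observed state, in order."""
    if nsx_status in ("ACTIVATING", "ERROR"):
        # full_remediate_fix.py is the all-in-one alternative to these two.
        return ["clear_locks.py", "fix_stuck_tasks.py"]
    if vip_http_status == 503:
        steps = ["nsx_diag.py"]
        if load is not None and load > 30:
            steps.append("nsx_slim.py")
        steps.append("nsx_monitor.py")
        return steps
    if nsx_status == "ACTIVE" and vip_http_status == 200:
        return ["nsx_cred_update.py"]
    return []  # no branch of the tree applies; investigate manually


if __name__ == "__main__":
    print(next_steps("ACTIVE", 503, load=35))
```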
Document Information
| Field | Value |
|---|---|
| Document Title | VCF 9.0.1 Diagnostic Scripts Cheat Sheet |
| Version | 2.0 |
| Last Updated | February 2026 |
| Script Count | 20 scripts |
| Prerequisite | Python 3.x + Paramiko (pip install paramiko) |
| Environment | Dell Precision 7920, VMware Workstation Nested Lab |
Scripts are for lab/educational purposes. Verify target hosts before running. Database-modifying scripts back up and restore pg_hba.conf automatically.
(c) 2026 Virtual Control LLC. All rights reserved.