Commit Graph

182 Commits

Author SHA1 Message Date
henrygd
e527534016 ensure deprecated system fields are migrated to newer structures
also removes refs to legacy load avg fields (l1, l5, l15) that were
around for a very short period
2026-03-10 18:46:57 -04:00
VACInc
963fce5a33 agent: mark mdraid rebuild as warning, not failed (#1797) 2026-03-09 17:54:53 -04:00
Sven van Ginkel
d38c0da06d fix: bypass NIC auto-filter when interface is explicitly whitelisted via NICS (#1805)
Co-authored-by: henrygd <hank@henrygd.me>
2026-03-09 17:47:59 -04:00
henrygd
6b1ff264f2 gpu(amd): add workaround for misreported sysfs filesize (#1799) 2026-03-09 14:53:52 -04:00
henrygd
5e1b028130 refactor(smart): improve perf by skipping ata_device_statistics parsing if unnecessary 2026-03-08 15:19:50 -04:00
henrygd
638e7dc12a fix(smart): handle negative ATA device statistics values (#1791) 2026-03-08 13:34:16 -04:00
henrygd
73c262455d refactor(agent): move GetEnv to utils package 2026-03-07 14:12:17 -05:00
henrygd
0c4d2edd45 refactor(agent): add utils package; rm utils.go and fs_utils.go 2026-03-07 13:50:49 -05:00
henrygd
8f23fff1c9 refactor: mdraid comments and organization
also hide serial / firmware in smart details if empty, remove a few
unnecessary ops, and add a few more passed state values
2026-02-27 14:23:10 -05:00
VACInc
02c1a0c13d Add Linux mdraid health monitoring (#1750) 2026-02-27 13:42:47 -05:00
henrygd
69fdcb36ab support ZFS ARC on freebsd 2026-02-26 18:38:54 -05:00
henrygd
b91eb6de40 improve root I/O device detection and fallback (#1772)
- Match FILESYSTEM directly against I/O devices if partition lookup
fails
- Fall back to the most active I/O device if no root device is detected
- Add WARN logs in final fallback case to most active device
2026-02-26 18:11:33 -05:00
henrygd
ec69f6c6e0 improve disk I/O device matching for partition-to-disk mismatches (#1772)
findIoDevice now normalizes device names and falls back to prefix-based
matching when partition names differ from IOCounter names (e.g. nda0p2 →
nda0 on FreeBSD). The most-active prefix-related device is selected,
avoiding the broad "most active of all" heuristic that caused Docker
misattribution in #1737.
2026-02-26 16:59:12 -05:00
henrygd
004841717a add checks for non-empty CPU times during initialization (#401) 2026-02-25 19:04:29 -05:00
henrygd
12545b4b6d fix: dedupe root-mirrored extra filesystems during disk discovery (#1428) 2026-02-24 15:41:29 -05:00
henrygd
04600d83cc refactor: small go 1.26 updates and go fix changes 2026-02-19 18:04:33 -05:00
henrygd
5d8906c9b2 amd gpu: small refactor + trim "series" from device name 2026-02-19 17:39:13 -05:00
henrygd
1e3a44e05d agent: improve multiplexed logs detection for podman (#1755) 2026-02-18 17:45:37 -05:00
henrygd
311095cfdd harden against docker api path traversal
Validate container IDs (12-64 hex) in hub container endpoints and agent
Docker requests, and build Docker URLs with escaped path segments. Add
regression tests for traversal/malformed container inputs and safe
endpoint construction.
2026-02-18 17:33:00 -05:00
henrygd
e1c1e97f0a chore: update go version / go deps / changelog 2026-02-18 16:17:05 -05:00
henrygd
f6b2824ccc rename gpu_apple_unsupported.go to gpu_darwin_unsupported.go 2026-02-18 15:15:58 -05:00
henrygd
f17ffc21b8 gate apple gpu collectors + revert readme change 2026-02-18 14:57:41 -05:00
Robert Accettura
f792f9b102 Mac GPU Stats (#1747) 2026-02-18 14:51:30 -05:00
henrygd
1def7d8d3a agent: add dockerManager.retrySleep method to mock time.Sleep in tests 2026-02-18 13:45:03 -05:00
Elio Di Nino
ef92b254bf fix(agent): Retry Docker check on non-200 HTTP response (#1754)
The previous behavior only caught some errors including inaccessible
hosts, but not others like failed authentication or service
unavailability. This largely applies when using a socket proxy and
having the retry mitigates some erroneous behavior.
2026-02-18 13:42:58 -05:00
henrygd
283fa9d5c2 include GTT memory in AMD GPU metrics (#1569) 2026-02-13 20:06:37 -05:00
henrygd
04d54a3efc update sysfs amd collector to pull pretty name from amdgpu.ids (#1569) 2026-02-13 19:41:40 -05:00
henrygd
14ecb1b069 add nvtop integration and introduce GPU_COLLECTOR env var 2026-02-13 19:41:40 -05:00
VACInc
e816ea143a SMART: add eMMC health via sysfs (#1736)
* SMART: add eMMC health via sysfs

Read eMMC wear/EOL indicators from /sys/class/block/mmcblk*/device and expose in SMART device list. Includes mocked sysfs tests and UI tweaks for unknown temps.

* small optimizations for emmc scan and parsing

* smart: keep smartctl optional only for Linux hosts with eMMC

* update smart alerts to handle warning state

* refactor: rename binPath to smartctlPath and replace hasSmartctl with smartctlPath checks

---------

Co-authored-by: henrygd <hank@henrygd.me>
2026-02-12 15:27:42 -05:00
henrygd
dba3519b2c fix(agent): avoid mismatched root disk I/O mapping in docker (#1737)
- Stop using max-read fallback when mapping root filesystem to
diskstats.
- Keep root usage reporting even when root I/O cannot be mapped.
- Expand docker fallback mount detection to include /etc/resolv.conf and
/etc/hostname (in addition to /etc/hosts).
- Add clearer warnings when root block device detection is uncertain.
2026-02-10 18:12:04 -05:00
Sven van Ginkel
6b7845b03e feat: add fingerprint command to agent (#1726)
Co-authored-by: henrygd <hank@henrygd.me>
2026-02-06 14:32:57 -05:00
henrygd
2a3885a52e add check to make sure fingerprint file isn't empty (#1714) 2026-02-04 20:05:07 -05:00
henrygd
5452e50080 add DISABLE_SSH env var (#1061) 2026-02-04 18:48:55 -05:00
henrygd
79ca31d770 improve container network stats granularity by using bytes instead of MB
Changes container network statistics to use raw byte values instead of converting to megabytes agent-side, providing more accurate measurements for low-bandwidth containers. Maintains backward compatibility with older agents/hubs through fallback logic.

- Agent now sends Bandwidth field as [sent_bytes, recv_bytes] array
- Deprecated NetworkSent/NetworkRecv fields still populated for compatibility
- Hub and frontend fall back to deprecated fields when Bandwidth is zero
- Record averaging correctly handles both old and new formats
- TODO markers added for cleanup in version 0.19+
2026-01-31 14:05:55 -05:00
Bart van der Braak
41f3705b6b update LibreHardwareMonitorLib to 0.9.5 (#1697)
fixes #1130

* add RuntimeIdentifier and AppendRuntimeIdentifierToOutputPath to beszel_lhm.csproj

* add more default sensor filters for LHM

---------

Co-authored-by: henrygd <hank@henrygd.me>
2026-01-30 19:23:56 -05:00
henrygd
20324763d2 remove stale systemd services from tracking after deletion (#1594) 2026-01-29 19:34:44 -05:00
henrygd
c7f7f51c99 add experimental sysfs amd gpu collector (#737, #1569) 2026-01-29 18:35:57 -05:00
henrygd
afc19ebd3b write health_file to /dev/shm instead of /tmp if available (#1455) 2026-01-28 15:21:45 -05:00
Sven van Ginkel
42da1e5a52 Bug: Apply SELinux context after binary replacement (#1678)
- Move SELinux context handling to internal/ghupdate for reuse
- Make chcon a true fallback (only runs if semanage/restorecon unavailable)
- Handle existing semanage rules with -m (modify) after -a (add) fails
- Apply SELinux handling to both agent and hub updates
- Add tests with proper skip behavior for SELinux systems

---------

Co-authored-by: henrygd <hank@henrygd.me>
2026-01-27 17:39:17 -05:00
Matthew Stern
1de36625a4 [Agent] feat: parse ATA device statistics for temperature and future metrics (#1689)
* feat: add ATA Device Statistics parsing and fall back for SMART temp reading

* simplify ata device statistics structs and fix smartctl args tests

* simplify ata device statistics lookup to use page number only

---------

Co-authored-by: henrygd <hank@henrygd.me>
2026-01-26 19:05:55 -05:00
henrygd
cb5f944de6 battery: ensure current charge doesn't exceed full capacity (#1668) 2026-01-22 13:01:21 -05:00
henrygd
23c4958145 increase smartctl --scan timeout to 10 seconds (#1465) 2026-01-21 19:09:57 -05:00
henrygd
edb2edc12c use name-only matching for unique SMART devices (#1655)
Fall back to name-only matching (previous behavior) when a device name
appears only once, preserving RAID composite key support added in #1655.
2026-01-21 18:25:03 -05:00
Julian Nadeau
648a979a81 Add SMART_DEVICES_SEPARATOR + allow drives with the same name to be added with different types (e.g. raid controllers) (#1655)
* Add SMART_DEVICES_SEPARATOR to override ,

* Allow composite keys in smart devices for raid controller support
2026-01-21 17:58:20 -05:00
Tamás Vince
acaa9381fe fix: update smartctlArgs call to use hasExistingData flag (#1645) 2026-01-16 15:30:52 -05:00
henrygd
9ad3cd0ab9 fix: GPU ID collision between Intel and NVIDIA collectors (#1522)
- Prefix Intel GPU ID as i0 to avoid NVML/NVIDIA index IDs like 0
- Update frontend GPU engines chart to select a GPU by id instead of
assuming g[0]
- Adjust tests to use the new Intel GPU id
2026-01-12 17:27:35 -05:00
henrygd
383913505f agent: fix tegrastats VDD_SYS_GPU parsing
- Parse VDD_SYS_GPU <mW>/<mW> correctly

- Add regression test for GPU@ temp + VDD_SYS_GPU power
2026-01-12 16:12:36 -05:00
Vascolas007
ca8cb78c29 Jetson tegrastats regex pre jetpack5 (#1631)
* feat:Adding regex catching groups for GPU temperature and power in pre jetpack 5
2026-01-12 16:11:22 -05:00
henrygd
3279a6ca53 agent: add separate glibc build with NVML support (#1618)
purego requires dynamic linking, so split the agent builds:
- Default: static binary without NVML (works on musl/alpine)
- Glibc: dynamic binary with NVML support via purego

Changes:
- Add glibc build tag to conditionally include NVML code
- Add beszel-agent-linux-amd64-glibc build/archive in goreleaser
- Update ghupdate to use glibc binary on glibc systems
- Switch nvidia dockerfile to golang:bookworm with -tags glibc
2026-01-12 15:38:13 -05:00
henrygd
6a1a98d73f update build constraints to exclude nvml collector on arm64 (#1618) 2026-01-11 20:27:34 -05:00