henrygd
|
0526c88ce0
|
support blacklisting and wildcard matching in SENSORS env var (#650)
- Moved sensor related code to sensors.go
- Added SensorConfig struct
- Added newSensorConfig
- Added tests
|
2025-04-17 21:08:05 -04:00 |
|
henrygd
|
d79111fce4
|
remove nvidia-smi dependency for jetson / tegrastats (#286)
|
2025-04-07 20:02:14 -04:00 |
|
henrygd
|
410d236f89
|
fix EXTRA_FILESYSTEMS for windows (#422)
Co-authored-by: coosir <git@coosir.com>
|
2025-04-05 17:57:34 -04:00 |
|
henrygd
|
968ca70670
|
agent temperature fixes (#648, #663)
- Fixes a bad sensor returning an error instead of other good sensors
- Adds ability to set GPU as PRIMARY_SENSOR
|
2025-03-15 00:29:41 -04:00 |
|
henrygd
|
c38d04b34b
|
Add health command for hub and align agent health command
|
2025-03-15 00:23:12 -04:00 |
|
henrygd
|
edefc6f53e
|
add health check for agent
- Updated command-line flag parsing.
- Moved GetAddress and GetNetwork to server.go
|
2025-03-14 03:33:25 -04:00 |
|
henrygd
|
521be05bc1
|
gpu.go refactoring and jetson fixes
- Fixed usage and power values
- Added new test cases
- Moved some variables to constants
|
2025-03-13 21:32:53 -04:00 |
|
henrygd
|
f397ab0797
|
fix: improve error logging for temperature sensor retrieval
|
2025-03-06 05:38:49 -05:00 |
|
henrygd
|
d25c7c58c1
|
fix: SYS_SENSORS context error (#643)
|
2025-03-06 05:36:20 -05:00 |
|
henrygd
|
6767392ea8
|
refactor: update some types in docker.go
|
2025-03-05 23:40:23 -05:00 |
|
henrygd
|
0443a85015
|
fix: correct typo in Docker stats collection variable name
|
2025-03-04 17:39:49 -05:00 |
|
henrygd
|
c4d8deb986
|
feat: agent data cache to support connections to multiple hubs (#341)
|
2025-03-04 16:25:45 -05:00 |
|
henrygd
|
681286eb4f
|
fix: add User-Agent to resolve Docker Desktop bug (#513, #603)
- also added body closure I forgot earlier whoops
|
2025-03-04 01:56:22 -05:00 |
|
henrygd
|
31431fd211
|
refactor: improve GPU data parsing
- Use byte-based regex matching instead of string-based matching
- Increase buffer size for GPU data
- Switch to `bufio.Scanner`
|
2025-03-04 00:15:10 -05:00 |
|
henrygd
|
ba7db28e80
|
test(gpu): add case for AMD multi-GPU and different power property (#414)
|
2025-02-22 12:45:47 -05:00 |
|
henrygd
|
6b41a98338
|
gpu: add tests and refactor to support amd on windows
|
2025-02-21 00:56:40 -05:00 |
|
henrygd
|
baf56fe83b
|
fix: refresh interfaces if agent starts before network online (#466)
|
2025-02-21 00:21:47 -05:00 |
|
henrygd
|
96f9128d1a
|
agent: add lock for gatherStats
|
2025-02-21 00:20:41 -05:00 |
|
henrygd
|
7485f79071
|
refactor(agent): refactor option parsing logic for agent command
|
2025-02-19 19:39:24 -05:00 |
|
henrygd
|
d170e7a00d
|
feat(agent): NETWORK env var and support for multiple keys
- merges agent.Run with agent.NewAgent
- separates StartServer method
- bumps go version to 1.24
- add tests
|
2025-02-19 00:32:27 -05:00 |
|
henrygd
|
5ea6eb08a1
|
feat: PRIMARY_SENSOR env var to choose dashboard temp
|
2025-02-11 15:11:46 -05:00 |
|
henrygd
|
3afab00937
|
feat: display peak GPU usage in dashboard
|
2025-02-08 19:24:38 -05:00 |
|
henrygd
|
e6054058b9
|
feat: add temperatures to dashboard
- Refactor temperature related code and move to standalone function
|
2025-02-07 21:27:15 -05:00 |
|
Henry Dollman
|
83668e5727
|
fix(gpu): handle power for dedicated amd gpus (#414)
|
2025-01-30 20:28:31 -05:00 |
|
Henry Dollman
|
120aff0d18
|
config: prefix environment variables with BESZEL_AGENT_ (#502)
|
2025-01-29 20:13:07 -05:00 |
|
hank
|
76347f25e5
|
fix(gpu): prevent nvidia-smi from running on tegra devices
|
2025-01-24 23:12:39 -05:00 |
|
hank
|
c157f38957
|
gpu: Add closure for Jetson and improve compatibility
|
2025-01-24 22:07:37 -05:00 |
|
Links
|
d185dfdef8
|
get Jetson GPU Information
|
2025-01-24 19:17:33 -05:00 |
|
Henry Dollman
|
1ac165d7d3
|
include stats in error log when encoding stats fails
|
2025-01-05 17:58:38 -05:00 |
|
Henry Dollman
|
8e531e6b3c
|
fix: handle duplicate GPU names (#361)
|
2025-01-05 16:40:22 -05:00 |
|
Henry Dollman
|
b08219dacf
|
refactor agent gpu code to make it easier to add intel / jetson
|
2024-12-17 17:12:58 -05:00 |
|
Henry Dollman
|
b4bc8a31aa
|
add check / reset for invalid disk i/o rates
|
2024-11-24 15:56:12 -05:00 |
|
Henry Dollman
|
4cb7b97416
|
change podman socket path to use current uid
|
2024-11-12 18:14:43 -05:00 |
|
Henry Dollman
|
b1db450e00
|
enable gpu monitoring by default
|
2024-11-12 18:13:57 -05:00 |
|
Henry Dollman
|
2e8ac98924
|
Improve disk discovery slightly by checking partition labels
|
2024-11-12 18:11:44 -05:00 |
|
Henry Dollman
|
3cd11d6bc4
|
improve podman support (#211)
|
2024-11-12 11:59:56 -05:00 |
|
Henry Dollman
|
03de73560c
|
add gpu power consumption chart
|
2024-11-08 20:31:22 -05:00 |
|
Henry Dollman
|
cd10727795
|
gpu usage and vram charts
|
2024-11-08 18:00:30 -05:00 |
|
Henry Dollman
|
8262a9a45b
|
progress on gpu metrics
|
2024-11-08 16:52:50 -05:00 |
|
Henry Dollman
|
655bfc95ca
|
add ability to specify partition for extra disk using folder name
|
2024-11-04 20:52:27 -05:00 |
|
Henry Dollman
|
741575df15
|
revert tweaks for old docker. needs more testing.
|
2024-11-02 14:43:35 -04:00 |
|
Henry Dollman
|
df0f3a154f
|
rtl layout progress and updates to arabic translations
|
2024-10-31 16:48:28 -04:00 |
|
Henry Dollman
|
f8fc74116c
|
rm *sensors.Warnings conversion - gopsutil windows uses different type
|
2024-10-26 14:02:19 -04:00 |
|
Henry Dollman
|
4094df3a61
|
fix: skip temperature collection if SENSORS is empty string (#196)
|
2024-10-24 15:10:20 -04:00 |
|
Henry Dollman
|
4a78ce1b16
|
skip temperatures code if sensors whitelist is set to empty string
|
2024-10-23 18:37:38 -04:00 |
|
Henry Dollman
|
539c0ccb1d
|
retry failed containers separately so we can run them in parallel (#58)
|
2024-10-21 17:00:13 -04:00 |
|
Henry Dollman
|
b5c158d1b3
|
update debug logs
|
2024-10-19 18:12:25 -04:00 |
|
Henry Dollman
|
8bf7a0e1d6
|
add DOCKER_TIMEOUT env var
|
2024-10-19 16:33:33 -04:00 |
|
Henry Dollman
|
ee92e338cb
|
update debug log locations
|
2024-10-16 18:12:43 -04:00 |
|
Henry Dollman
|
59d541dd1d
|
fix edge case overwriting extra filesystem with root io fallback
|
2024-10-16 15:26:12 -04:00 |
|