Refactor cdi api #1166
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Pull Request Overview
Refactors the CDI API to use a unified factory-based SpecGenerator pattern and consolidates NVML spec generation logic.
- Introduce `deviceSpecGeneratorFactory`, `SpecGenerator`, and `DeviceSpecGenerator` types and update all implementations accordingly
- Replace redundant per-device methods with `fullGPUDeviceSpecGenerator`/`migDeviceSpecGenerator` and a combined `DeviceSpecGenerators` type
- Update wrapper and CLI to use `GetDeviceSpecsByID("all")` and deprecate `GetAllDeviceSpecs`
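The factory pattern summarized above could look roughly like the following sketch. The type names `deviceSpecGeneratorFactory`, `DeviceSpecGenerator`, and `DeviceSpecGenerators` come from the PR, but the method signatures, the `DeviceSpec` stand-in, and the `staticDevice`/`staticFactory` helpers are assumptions made for illustration and are not the actual `nvcdi` definitions.

```go
package main

import "fmt"

// DeviceSpec is a simplified stand-in for a CDI device spec (assumption).
type DeviceSpec struct {
	Name string
}

// DeviceSpecGenerator yields the specs for the devices it represents.
// The method set is assumed for illustration.
type DeviceSpecGenerator interface {
	GetDeviceSpecs() ([]DeviceSpec, error)
}

// DeviceSpecGenerators combines several generators behind the same interface.
type DeviceSpecGenerators []DeviceSpecGenerator

// GetDeviceSpecs concatenates the specs of all wrapped generators.
func (gs DeviceSpecGenerators) GetDeviceSpecs() ([]DeviceSpec, error) {
	var specs []DeviceSpec
	for _, g := range gs {
		s, err := g.GetDeviceSpecs()
		if err != nil {
			return nil, err
		}
		specs = append(specs, s...)
	}
	return specs, nil
}

// deviceSpecGeneratorFactory would be implemented once per mode
// (NVML, CSV, WSL, ...). The signature is an assumption.
type deviceSpecGeneratorFactory interface {
	DeviceSpecGenerators(ids ...string) (DeviceSpecGenerator, error)
}

// staticDevice is a hypothetical generator for a single named device.
type staticDevice string

func (d staticDevice) GetDeviceSpecs() ([]DeviceSpec, error) {
	return []DeviceSpec{{Name: string(d)}}, nil
}

// staticFactory is a hypothetical mode backed by a fixed set of device IDs.
type staticFactory struct {
	known map[string]bool
}

func (f staticFactory) DeviceSpecGenerators(ids ...string) (DeviceSpecGenerator, error) {
	var gens DeviceSpecGenerators
	for _, id := range ids {
		if !f.known[id] {
			return nil, fmt.Errorf("unknown device ID %q", id)
		}
		gens = append(gens, staticDevice(id))
	}
	return gens, nil
}

func main() {
	var factory deviceSpecGeneratorFactory = staticFactory{known: map[string]bool{"0": true, "1": true}}
	gen, err := factory.DeviceSpecGenerators("0", "1")
	if err != nil {
		panic(err)
	}
	specs, err := gen.GetDeviceSpecs()
	fmt.Println(specs, err)
}
```

Because the combined `DeviceSpecGenerators` slice satisfies the same interface as a single generator, callers in this sketch do not need to distinguish between one device and many.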
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
pkg/nvcdi/wrapper.go | Added factory field and new SpecGenerator types |
pkg/nvcdi/mofed.go | Implement factory interface for MOFED |
pkg/nvcdi/mig-device-nvml.go | Extracted MIG spec generator type |
pkg/nvcdi/management.go | Implement factory interface for management mode |
pkg/nvcdi/lib.go | Wire deviceSpecGeneratorFactory into New |
pkg/nvcdi/lib-wsl.go | Implement factory interface for WSL mode |
pkg/nvcdi/lib-nvml_test.go | Add tests for NVML DeviceSpecGenerators |
pkg/nvcdi/lib-nvml.go | Major refactor: split Init/Shutdown and generators |
pkg/nvcdi/lib-imex.go | Implement factory interface for IMEX |
pkg/nvcdi/lib-csv.go | Implement factory interface for CSV mode |
pkg/nvcdi/gds.go | Implement factory interface for GDS mode |
pkg/nvcdi/full-gpu-nvml.go | Extracted full GPU spec generator type |
pkg/nvcdi/api.go | Updated Interface to use SpecGenerator |
cmd/nvidia-ctk/cdi/generate/generate.go | Replace deprecated GetAllDeviceSpecs call |
Comments suppressed due to low confidence (4)
pkg/nvcdi/lib-nvml.go:108
- [nitpick] The local variable `DeviceSpecGenerators` shadows the type `DeviceSpecGenerators`, which may confuse readers. Rename the variable (e.g., `generators`) to avoid shadowing.

    var DeviceSpecGenerators DeviceSpecGenerators

pkg/nvcdi/wrapper.go:39
- [nitpick] Add a Go doc comment for `deviceSpecGeneratorFactory` to explain its role in producing `DeviceSpecGenerator` instances.

    // TODO: Rename this type

pkg/nvcdi/lib-nvml.go:47
- [nitpick] Expand this comment to specify the exact ID formats supported (e.g., `gpuIndex`, `uuid`, or `gpuIndex:migIndex` for MIG devices) so consumers know how to request each device.

    // DeviceSpecGenerators returns the CDI device spec generators for NVML devices

pkg/nvcdi/lib-nvml_test.go:37
- Consider adding a test case for invalid device IDs (e.g., an unsupported string) to verify that `getDeviceSpecGeneratorsForIDs` returns an appropriate error.

    expectedLength int
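The suggestions about documenting ID formats and testing invalid IDs can be illustrated with a small table-driven check. This is only a sketch: `classifyID`, the `idKind` constants, and the accepted formats (`all`, a `GPU-`/`MIG-` prefixed UUID, `gpuIndex`, `gpuIndex:migIndex`) are hypothetical and do not reflect how `getDeviceSpecGeneratorsForIDs` actually parses its input.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// idKind labels the form of a device ID (hypothetical).
type idKind string

const (
	kindAll      idKind = "all"
	kindUUID     idKind = "uuid"
	kindGPUIndex idKind = "gpu-index"
	kindMIGIndex idKind = "mig-index"
)

// classifyID is a hypothetical helper that accepts "all", a UUID with a
// "GPU-" or "MIG-" prefix, "gpuIndex", or "gpuIndex:migIndex".
// Anything else is rejected with an error.
func classifyID(id string) (idKind, error) {
	if id == "all" {
		return kindAll, nil
	}
	if strings.HasPrefix(id, "GPU-") || strings.HasPrefix(id, "MIG-") {
		return kindUUID, nil
	}
	parts := strings.Split(id, ":")
	if len(parts) > 2 {
		return "", fmt.Errorf("invalid device ID %q: too many components", id)
	}
	for _, p := range parts {
		if n, err := strconv.Atoi(p); err != nil || n < 0 {
			return "", fmt.Errorf("invalid device ID %q: %q is not a valid index", id, p)
		}
	}
	if len(parts) == 2 {
		return kindMIGIndex, nil
	}
	return kindGPUIndex, nil
}

func main() {
	// Table-driven cases, including the invalid inputs the review asks for.
	cases := []struct {
		id      string
		wantErr bool
	}{
		{id: "all"},
		{id: "0"},
		{id: "0:1"},
		{id: "GPU-example"},
		{id: "not-a-device", wantErr: true},
		{id: "", wantErr: true},
		{id: "0:1:2", wantErr: true},
	}
	for _, tc := range cases {
		kind, err := classifyID(tc.id)
		if (err != nil) != tc.wantErr {
			panic(fmt.Sprintf("id %q: unexpected error state: %v", tc.id, err))
		}
		fmt.Printf("%q -> kind=%q err=%v\n", tc.id, kind, err)
	}
}
```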
var _ DeviceSpecGenerator = (*migDeviceSpecGenerator)(nil)

func (l *nvmllib) newMIGDeviceSpecGeneratorFromNVMLDevice(id string, nvmlDevice nvml.Device) (DeviceSpecGenerator, error) {
The fields `index` and `migIndex` on `migDeviceSpecGenerator` are never initialized, causing `getNames()` to always use zero indices. Pass and set the correct index and migIndex when constructing the generator.
-func (l *nvmllib) newMIGDeviceSpecGeneratorFromNVMLDevice(id string, nvmlDevice nvml.Device) (DeviceSpecGenerator, error) {
+func (l *nvmllib) newMIGDeviceSpecGeneratorFromNVMLDevice(id string, nvmlDevice nvml.Device, index int, migIndex int) (DeviceSpecGenerator, error) {
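To make the reported problem concrete, here is a minimal sketch of how uninitialized index fields lead to zero-valued names. The struct layout, the `newMIGDeviceSpecGenerator` constructor, and the `gpuIndex:migIndex` name format are assumptions for illustration; the real `migDeviceSpecGenerator` holds NVML handles and may derive names differently.

```go
package main

import "fmt"

// migDeviceSpecGenerator is a simplified stand-in for the type in the PR.
type migDeviceSpecGenerator struct {
	id       string
	index    int // parent GPU index
	migIndex int // MIG device index on that GPU
}

// newMIGDeviceSpecGenerator is a hypothetical constructor that sets both
// indices explicitly, as the review comment suggests.
func newMIGDeviceSpecGenerator(id string, index, migIndex int) *migDeviceSpecGenerator {
	return &migDeviceSpecGenerator{id: id, index: index, migIndex: migIndex}
}

// getNames returns an index-based name. The "gpuIndex:migIndex" format
// is assumed here.
func (g *migDeviceSpecGenerator) getNames() []string {
	return []string{fmt.Sprintf("%d:%d", g.index, g.migIndex)}
}

func main() {
	// Without initialization, Go zero values silently produce "0:0".
	unset := &migDeviceSpecGenerator{id: "MIG-example"}
	set := newMIGDeviceSpecGenerator("MIG-example", 1, 2)
	fmt.Println(unset.getNames(), set.getNames()) // prints "[0:0] [1:2]"
}
```

Since Go zero-initializes struct fields, the omission compiles and runs without complaint, which is why it is easy to miss without a test that checks names for a non-zero index.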
The original CDI spec generation API was focussed on NVML devices specifically. Since then, we have replaced the more specific functions (for GPU and MIG devices) in the API with more generally applicable functions based on mode and device IDs.
This organic growth of the API also means that, for the NVML case specifically, we had multiple different implementations of CDI spec generation, which made it harder to keep things consistent.
These changes remove the redundant functions in the `nvcdi.Interface`, allowing devices to be requested by ID across all use cases. They also refactor the CDI spec generation for NVML devices to ensure that the same generation logic is used for all cases.
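As a rough illustration of requesting devices by ID, the sketch below shows a deprecated `GetAllDeviceSpecs` delegating to `GetDeviceSpecsByID("all")`, so both paths share one implementation. The `Device` type and the `lib` struct are simplified stand-ins, and the lookup logic is an assumption; the actual `nvcdi` implementation works with CDI spec types and NVML devices.

```go
package main

import "fmt"

// Device is a simplified stand-in for a CDI device spec (assumption).
type Device struct {
	Name string
}

// lib is a hypothetical library holding a fixed list of devices.
type lib struct {
	devices []Device
}

// GetDeviceSpecsByID returns the specs for the requested IDs. In this
// sketch, the special ID "all" selects every known device.
func (l *lib) GetDeviceSpecsByID(ids ...string) ([]Device, error) {
	var out []Device
	for _, id := range ids {
		if id == "all" {
			out = append(out, l.devices...)
			continue
		}
		found := false
		for _, d := range l.devices {
			if d.Name == id {
				out = append(out, d)
				found = true
			}
		}
		if !found {
			return nil, fmt.Errorf("device %q not found", id)
		}
	}
	return out, nil
}

// GetAllDeviceSpecs is kept for compatibility only.
//
// Deprecated: use GetDeviceSpecsByID("all") instead.
func (l *lib) GetAllDeviceSpecs() ([]Device, error) {
	return l.GetDeviceSpecsByID("all")
}

func main() {
	l := &lib{devices: []Device{{Name: "0"}, {Name: "1"}}}
	all, err := l.GetDeviceSpecsByID("all")
	fmt.Println(all, err)
	one, err := l.GetDeviceSpecsByID("1")
	fmt.Println(one, err)
}
```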