vllm.v1.kv_offload.worker.worker
OffloadingHandler
Bases: ABC
OffloadingHandler class for managing asynchronous KV data transfers.
This class runs in the worker. It kicks off asynchronous KV data transfer requests and lets the caller collect their completion statuses.
The class provides the following primitives:
transfer_async() - kicks off a new transfer job.
get_finished() - returns the (job_id, success) results of newly finished jobs.
Source code in vllm/v1/kv_offload/worker/worker.py
get_finished abstractmethod
get_finished() -> list[TransferResult]
Get transfers finished since last call.
Returns:
Type | Description |
---|---|
list[TransferResult] | A list of (job_id, success) results, one for each transfer that finished since the last call. |
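The "since last call" contract means a caller has to poll repeatedly and accumulate results itself. The following is a minimal sketch of such a polling loop, not vLLM code: `FakeHandler`, `drain`, and the `max_polls` bound are hypothetical names, and `TransferResult` is assumed to be a plain `(job_id, success)` tuple.

```python
from collections import deque

# Assumption: a TransferResult is a (job_id, success) tuple.
TransferResult = tuple[int, bool]


class FakeHandler:
    """Hypothetical stand-in that reports one finished job per poll."""

    def __init__(self, job_ids: list[int]) -> None:
        self._pending = deque(job_ids)

    def get_finished(self) -> list[TransferResult]:
        # Report at most one newly finished job, then forget it, so a job
        # is never reported twice.
        if not self._pending:
            return []
        return [(self._pending.popleft(), True)]


def drain(handler: FakeHandler, expected: set[int], max_polls: int = 100) -> dict[int, bool]:
    """Poll until every expected job ID was reported or max_polls is reached."""
    results: dict[int, bool] = {}
    for _ in range(max_polls):
        if not expected - set(results):
            break
        for job_id, success in handler.get_finished():
            results[job_id] = success
    return results


statuses = drain(FakeHandler([1, 2, 3]), {1, 2, 3})
```

Because each result is returned only once, the caller keeps its own `job_id` to status map rather than re-querying the handler.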
transfer_async abstractmethod
transfer_async(job_id: int, spec: TransferSpec) -> bool
Initiates an asynchronous transfer of KV data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | A unique ID that identifies this transfer when its completion is reported. | required |
spec | TransferSpec | The (src, dst) spec of the KV data transfer. | required |
Returns:
Type | Description |
---|---|
bool | True if the transfer was submitted successfully. |
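To illustrate how the two abstract methods fit together, here is a toy handler written against a local stand-in for the interface. Everything in it is an assumption made for illustration: `HandlerInterface` only mirrors the two signatures documented above, `ListCopyHandler` is hypothetical, and `TransferSpec` is modeled as a `(src, dst)` pair of Python lists. The copy also runs inline, whereas a real handler would enqueue the work and return immediately.

```python
from abc import ABC, abstractmethod
from typing import Any

# Assumed shapes, chosen only for this sketch.
TransferResult = tuple[int, bool]  # (job_id, success)
TransferSpec = tuple[Any, Any]     # (src, dst)


class HandlerInterface(ABC):
    """Local stand-in mirroring the two abstract methods documented above."""

    @abstractmethod
    def transfer_async(self, job_id: int, spec: TransferSpec) -> bool: ...

    @abstractmethod
    def get_finished(self) -> list[TransferResult]: ...


class ListCopyHandler(HandlerInterface):
    """Hypothetical handler that copies items from one list to another."""

    def __init__(self) -> None:
        self._finished: list[TransferResult] = []

    def transfer_async(self, job_id: int, spec: TransferSpec) -> bool:
        src, dst = spec
        if not isinstance(src, list) or not isinstance(dst, list):
            return False  # reject specs this handler cannot serve
        dst.extend(src)  # done inline here; a real handler would defer this
        self._finished.append((job_id, True))
        return True

    def get_finished(self) -> list[TransferResult]:
        # Return the completions recorded so far and reset the buffer.
        finished, self._finished = self._finished, []
        return finished


handler = ListCopyHandler()
dst: list[int] = []
submitted = handler.transfer_async(7, ([1, 2], dst))
first_poll = handler.get_finished()
second_poll = handler.get_finished()
```

In this sketch the return value of `transfer_async()` only says whether the job was accepted; the outcome of the transfer itself arrives later through `get_finished()`.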
OffloadingWorker
OffloadingWorker class for managing asynchronous KV data transfers using multiple OffloadingHandlers.
This class runs in the worker. It kicks off asynchronous KV data transfer requests by delegating each one to a registered OffloadingHandler, chosen by transfer type.
The class provides the following primitives:
register_handler() - registers a new handler for a specific transfer type.
transfer_async() - kicks off a new transfer job using one of the registered handlers.
get_finished() - returns the (job_id, success) results of newly finished jobs from all handlers.
transfer_type_to_handler instance-attribute
transfer_type_to_handler: dict[TransferType, OffloadingHandler] = {}
__init__
get_finished
get_finished() -> list[TransferResult]
Get transfers finished since last call.
Returns:
Type | Description |
---|---|
list[TransferResult] | A list of (job_id, success) results, one for each transfer that finished since the last call. |
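Since the worker reports completions from all of its handlers, one plausible reading is that it concatenates each handler's newly finished results. The sketch below shows that idea only; `StubHandler` and `collect_finished` are hypothetical names, `TransferResult` is assumed to be a `(job_id, success)` tuple, and the actual implementation may differ.

```python
# Assumption: a TransferResult is a (job_id, success) tuple.
TransferResult = tuple[int, bool]


class StubHandler:
    """Hypothetical handler whose finished jobs are preloaded for the demo."""

    def __init__(self, finished: list[TransferResult]) -> None:
        self._finished = list(finished)

    def get_finished(self) -> list[TransferResult]:
        # Hand back everything finished so far and reset, so the next call
        # only reports newer completions.
        finished, self._finished = self._finished, []
        return finished


def collect_finished(handlers: list[StubHandler]) -> list[TransferResult]:
    """Concatenate the newly finished results of every handler."""
    results: list[TransferResult] = []
    for handler in handlers:
        results.extend(handler.get_finished())
    return results


handlers = [StubHandler([(1, True)]), StubHandler([(2, False), (3, True)])]
first = collect_finished(handlers)
second = collect_finished(handlers)
```

Note that a failed transfer, such as job 2 here, is still reported as finished; only its `success` flag differs.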
register_handler
register_handler(
    src_cls: type[LoadStoreSpec],
    dst_cls: type[LoadStoreSpec],
    handler: OffloadingHandler,
) -> None
Registers a new handler for transfers from src_cls to dst_cls.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
src_cls | type[LoadStoreSpec] | the source type of transfers handled by this handler. | required |
dst_cls | type[LoadStoreSpec] | the destination type of transfers handled by this handler. | required |
handler | OffloadingHandler | the handler that will handle transfers. | required |
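A rough sketch of how registration and dispatch could interact is shown below. It is not the vLLM implementation: `WorkerSketch`, `RecordingHandler`, `GPUSpec`, and `CPUSpec` are hypothetical, the lookup key is assumed to be the `(src class, dst class)` pair (the real `TransferType` key may be derived differently), and returning `False` for an unregistered pair is an assumed behavior.

```python
from typing import Any


class GPUSpec:
    """Hypothetical stand-in for a LoadStoreSpec subclass."""


class CPUSpec:
    """Hypothetical stand-in for a LoadStoreSpec subclass."""


class RecordingHandler:
    """Hypothetical handler that only records the job IDs it receives."""

    def __init__(self) -> None:
        self.job_ids: list[int] = []

    def transfer_async(self, job_id: int, spec: tuple[Any, Any]) -> bool:
        self.job_ids.append(job_id)
        return True


class WorkerSketch:
    """Routes each transfer to the handler registered for its (src, dst) types."""

    def __init__(self) -> None:
        # Assumption: handlers are keyed by the (src class, dst class) pair.
        self.transfer_type_to_handler: dict[tuple[type, type], RecordingHandler] = {}

    def register_handler(self, src_cls: type, dst_cls: type, handler: RecordingHandler) -> None:
        self.transfer_type_to_handler[(src_cls, dst_cls)] = handler

    def transfer_async(self, job_id: int, spec: tuple[Any, Any]) -> bool:
        src, dst = spec
        handler = self.transfer_type_to_handler.get((type(src), type(dst)))
        if handler is None:
            return False  # assumed behavior when no handler matches
        return handler.transfer_async(job_id, spec)


worker = WorkerSketch()
store, load = RecordingHandler(), RecordingHandler()
worker.register_handler(GPUSpec, CPUSpec, store)  # e.g. offloading direction
worker.register_handler(CPUSpec, GPUSpec, load)   # e.g. loading direction
ok_store = worker.transfer_async(1, (GPUSpec(), CPUSpec()))
ok_load = worker.transfer_async(2, (CPUSpec(), GPUSpec()))
ok_unknown = worker.transfer_async(3, (CPUSpec(), CPUSpec()))
```

Registering one handler per direction keeps each handler focused on a single kind of copy, while callers only ever talk to the worker.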
transfer_async
transfer_async(job_id: int, spec: TransferSpec) -> bool
Initiates an asynchronous transfer of KV data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | A unique ID that identifies this transfer when its completion is reported. | required |
spec | TransferSpec | The (src, dst) spec of the KV data transfer. | required |
Returns:
Type | Description |
---|---|
bool | True if the transfer was submitted successfully. |