FeaturesΒΆ
Compatibility MatrixΒΆ
The tables below show mutually exclusive features and the support on some hardware.
The symbols used have the following meanings:
- β = Full compatibility
- π = Partial compatibility
- β = No compatibility
- β = Unknown or TBD
Note
Check the β or π with links to see tracking issue for unsupported feature/hardware combination.
Feature x FeatureΒΆ
Feature | CP | APC | LoRA | SD | CUDA graph | pooling | enc-dec | logP | prmpt logP | async output | multi-step | mm | best-of | beam-search | prompt-embeds |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CP | β | ||||||||||||||
APC | β | β | |||||||||||||
LoRA | β | β | β | ||||||||||||
SD | β | β | β | β | |||||||||||
CUDA graph | β | β | β | β | β | ||||||||||
pooling | π * | π * | β | β | β | β | |||||||||
enc-dec | β | β | β | β | β | β | β | ||||||||
logP | β | β | β | β | β | β | β | β | |||||||
prmpt logP | β | β | β | β | β | β | β | β | β | ||||||
async output | β | β | β | β | β | β | β | β | β | β | |||||
multi-step | β | β | β | β | β | β | β | β | β | β | β | ||||
mm | β | β | π ^ | β | β | β | β | β | β | β | β | β | |||
best-of | β | β | β | β | β | β | β | β | β | β | β | β | β | ||
beam-search | β | β | β | β | β | β | β | β | β | β | β | β | β | β | |
prompt-embeds | β | β | β | β | β | β | β | β | β | β | β | β | β | β | β |
* Chunked prefill and prefix caching are only applicable to last-token pooling.
^ LoRA is only applicable to the language backbone of multimodal models.
Feature x HardwareΒΆ
Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD | TPU |
---|---|---|---|---|---|---|---|---|
CP | β | β | β | β | β | β | β | β |
APC | β | β | β | β | β | β | β | β |
LoRA | β | β | β | β | β | β | β | β |
SD | β | β | β | β | β | β | β | β |
CUDA graph | β | β | β | β | β | β | β | β |
pooling | β | β | β | β | β | β | β | β |
enc-dec | β | β | β | β | β | β | β | β |
mm | β | β | β | β | β | β | β | β |
logP | β | β | β | β | β | β | β | β |
prmpt logP | β | β | β | β | β | β | β | β |
async output | β | β | β | β | β | β | β | β |
multi-step | β | β | β | β | β | β | β | β |
best-of | β | β | β | β | β | β | β | β |
beam-search | β | β | β | β | β | β | β | β |
prompt-embeds | β | β | β | β | β | β | ? | β |