Examples

vLLM's examples are split into three categories:

If you are using vLLM from within Python code, see the Offline Inference section.
If you are using vLLM from an HTTP application or client, see the Online Serving section.
For examples of vLLM's advanced features (e.g. LMCache or Tensorizer) that are not specific to either of the above use cases, see the Others section.
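
To give a rough sense of what the Online Serving category looks like from the client side, here is a minimal sketch that uses only the Python standard library. It assumes a vLLM OpenAI-compatible server has already been started (for example with `vllm serve <model>`) and is listening on `http://localhost:8000`. The address, the model name `facebook/opt-125m`, and the helper names `build_completion_request` and `complete` are illustrative assumptions, not part of vLLM.

```python
import json
import urllib.request

# Assumption: a vLLM OpenAI-compatible server is listening at this address.
BASE_URL = "http://localhost:8000"


def build_completion_request(prompt, model, base_url=BASE_URL, max_tokens=32):
    """Build (but do not send) a POST request for the /v1/completions route."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, model):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, model)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Hello, my name is", "facebook/opt-125m")` would return the generated continuation; the model name is a placeholder and must match the model the server was started with. Building the request separately from sending it keeps the payload easy to inspect without a live server.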