vllm.distributed.kv_transfer.kv_connector.v1.lmcache_integration.utils
_config_instance module-attribute
apply_mm_hashes_to_token_ids
apply_mm_hashes_to_token_ids(
    token_ids: Tensor,
    mm_hashes: list[str],
    mm_positions: list[PlaceholderRange],
) -> Tensor
Overwrite token_ids in-place for multimodal placeholders using efficient slice assignments.
Source code in vllm/distributed/kv_transfer/kv_connector/v1/lmcache_integration/utils.py
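As a rough illustration of the in-place slice overwrite, the sketch below uses a plain Python list in place of a tensor. `PlaceholderSpan`, `hash_to_token`, and `overwrite_placeholders` are hypothetical names, and the `offset`/`length` fields and the 16-bit reduction of the hash are assumptions made for the example, not confirmed details of the vLLM implementation.

```python
from dataclasses import dataclass


@dataclass
class PlaceholderSpan:
    """Hypothetical stand-in for PlaceholderRange (field names are assumed)."""
    offset: int
    length: int


def hash_to_token(hex_hash: str) -> int:
    # Assumption: the hex digest is folded into a 16-bit integer.
    return int(hex_hash, 16) & 0xFFFF


def overwrite_placeholders(token_ids, mm_hashes, mm_positions):
    """Overwrite each placeholder span in token_ids with a value derived from its hash."""
    n = len(token_ids)
    for hex_hash, span in zip(mm_hashes, mm_positions):
        if span.offset >= n:
            continue  # span starts past the end of the sequence
        end = min(span.offset + span.length, n)  # clamp to the sequence length
        # One slice assignment per span. A tensor could take a scalar here;
        # a list needs a same-length list so its size does not change.
        token_ids[span.offset:end] = [hash_to_token(hex_hash)] * (end - span.offset)
    return token_ids


tokens = [101, 7, 7, 7, 102, 9, 9]
result = overwrite_placeholders(
    tokens,
    ["1a2b", "ffff0001"],
    [PlaceholderSpan(1, 3), PlaceholderSpan(5, 4)],
)
print(result)
```

The second span runs past the end of the sequence and is clamped, so only the last two positions are overwritten. The returned object is the same list that was passed in, which mirrors the in-place behavior described above.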
create_lmcache_metadata
create_lmcache_metadata(
    vllm_config=None,
    model_config=None,
    parallel_config=None,
    cache_config=None,
)
Create LMCacheEngineMetadata from vLLM configuration.
This function extracts common metadata creation logic that was duplicated across multiple files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vllm_config | VllmConfig | vLLM configuration object containing model, parallel, and cache configs (alternative to individual config parameters) | None |
| model_config | ModelConfig | Model configuration (alternative to vllm_config) | None |
| parallel_config | ParallelConfig | Parallel configuration (alternative to vllm_config) | None |
| cache_config | CacheConfig | Cache configuration (alternative to vllm_config) | None |
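The "either one bundled config or three individual configs" calling convention from the table can be sketched as follows. This is only an illustration of the pattern: `FakeVllmConfig` and `resolve_configs` are hypothetical, the attribute names on the bundled object are assumed, and whether the real function gives `vllm_config` precedence or raises on missing arguments is not stated in this reference.

```python
from dataclasses import dataclass


@dataclass
class FakeVllmConfig:
    """Hypothetical stand-in that bundles the three sub-configs."""
    model_config: str
    parallel_config: str
    cache_config: str


def resolve_configs(vllm_config=None, model_config=None,
                    parallel_config=None, cache_config=None):
    """Return (model, parallel, cache) configs from either calling style."""
    if vllm_config is not None:
        # Assumption: the bundled object wins when it is supplied.
        return (vllm_config.model_config,
                vllm_config.parallel_config,
                vllm_config.cache_config)
    individual = {
        "model_config": model_config,
        "parallel_config": parallel_config,
        "cache_config": cache_config,
    }
    missing = [name for name, value in individual.items() if value is None]
    if missing:
        # Assumption: incomplete input is treated as an error in this sketch.
        raise ValueError(f"missing configs: {', '.join(missing)}")
    return model_config, parallel_config, cache_config


bundled = resolve_configs(vllm_config=FakeVllmConfig("m", "p", "c"))
separate = resolve_configs(model_config="m", parallel_config="p", cache_config="c")
print(bundled == separate)
```

Both calling styles resolve to the same triple, which is the point of accepting the parameters as alternatives.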
extract_mm_features
extract_mm_features(
    request: Union[Request, NewRequestData],
    modify: bool = False,
) -> tuple[list[str], list[PlaceholderRange]]
Normalize multimodal information from a Request into parallel lists.
This helper reads either:
1) request.mm_features (objects each exposing .identifier and .mm_position), or
2) the legacy fields request.mm_hashes and request.mm_positions.
It returns two equally sized lists: the multimodal hash identifiers and their corresponding positions. If the request contains no multimodal info, it returns ([], []).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request | Union[Request, NewRequestData] | The source object. | required |
| modify | bool | Controls copy semantics for the legacy-path return values. If True and legacy fields are used, shallow copies are returned so the caller can mutate the lists without affecting the request. | False |
Returns:
| Type | Description |
|---|---|
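| tuple[list[str], list[PlaceholderRange]] | The multimodal hash identifiers and their corresponding positions, as two equally sized lists. May be ([], []) when the request contains no multimodal info. |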
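The two lookup paths and the copy semantics can be sketched with simple stand-in objects. `extract_features` is a hypothetical re-creation for illustration, not the vLLM source; in particular, returning the original lists unchanged when `modify` is False is an assumption of this sketch.

```python
from types import SimpleNamespace


def extract_features(request, modify=False):
    """Return (hashes, positions) from either the new or the legacy fields."""
    features = getattr(request, "mm_features", None)
    if features:
        # New-style path: build fresh lists from the feature objects.
        return ([f.identifier for f in features],
                [f.mm_position for f in features])
    hashes = getattr(request, "mm_hashes", None)
    positions = getattr(request, "mm_positions", None)
    if not hashes or not positions:
        return [], []  # no multimodal info on this request
    if modify:
        # Shallow copies, so the caller can mutate without touching the request.
        return list(hashes), list(positions)
    # Assumption: without modify, the request's own lists are handed back.
    return hashes, positions


new_style = SimpleNamespace(
    mm_features=[SimpleNamespace(identifier="abc", mm_position=(0, 4))]
)
legacy = SimpleNamespace(mm_hashes=["abc"], mm_positions=[(0, 4)])
empty = SimpleNamespace()

from_features = extract_features(new_style)
copied_hashes, _ = extract_features(legacy, modify=True)
shared_hashes, _ = extract_features(legacy)
nothing = extract_features(empty)
```

Here `copied_hashes` equals `legacy.mm_hashes` but is a distinct list, while `shared_hashes` is the very same list object, so mutating it would also change the request.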
hex_hash_to_int16
is_false
Check if the given string value is equivalent to 'false'.
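This reference does not list which strings count as equivalent to 'false', so the sketch below only shows the general shape of such a check. `looks_false` and the accepted set `FALSE_STRINGS` are hypothetical and may differ from what `is_false` actually accepts.

```python
# Hypothetical accepted spellings; the real set is not given in this reference.
FALSE_STRINGS = {"false", "0", "no", "off"}


def looks_false(value: str) -> bool:
    """Case-insensitive membership test against the assumed set."""
    return value.strip().lower() in FALSE_STRINGS


checks = [looks_false("False"), looks_false(" OFF "), looks_false("true")]
print(checks)
```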
lmcache_get_or_create_config
Get the LMCache configuration from the environment variable LMCACHE_CONFIG_FILE. If the environment variable is not set, this function will return the default configuration.
This function is thread-safe and implements the singleton pattern, ensuring the configuration is loaded only once.
mla_enabled
mla_enabled(model_config: ModelConfig) -> bool