vllm.transformers_utils.configs.dotsocr ¶
DotsOCRConfig ¶
Bases: Qwen2Config
Source code in vllm/transformers_utils/configs/dotsocr.py
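Since `DotsOCRConfig` inherits from `Qwen2Config`, the text-model hyperparameters come from the Qwen2 side, while the vision tower is configured separately through `DotsVisionConfig` below. The stand-in classes here are an illustrative sketch of that composition pattern only, not the vLLM implementation (field names beyond those shown on this page are assumptions):

```python
class Qwen2ConfigStub:
    """Illustrative stand-in for Qwen2Config (text-model hyperparameters)."""

    def __init__(self, hidden_size: int = 4096, num_hidden_layers: int = 32, **kwargs):
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers


class DotsOCRConfigStub(Qwen2ConfigStub):
    """Illustrative stand-in for DotsOCRConfig.

    Language-model settings are inherited from the Qwen2-style base; vision
    settings live in a nested vision config rather than on the top level.
    """

    def __init__(self, vision_config=None, **kwargs):
        super().__init__(**kwargs)
        self.vision_config = vision_config or {}
```

This mirrors the common multimodal-config layout in `transformers`, where a top-level config subclasses the language-model config and carries the vision config as a nested attribute.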
DotsVisionConfig ¶
Bases: PretrainedConfig
Source code in vllm/transformers_utils/configs/dotsocr.py
__init__ ¶
__init__(
embed_dim: int = 1536,
hidden_size: int = 1536,
intermediate_size: int = 4224,
num_hidden_layers: int = 42,
num_attention_heads: int = 12,
num_channels: int = 3,
patch_size: int = 14,
spatial_merge_size: int = 2,
temporal_patch_size: int = 1,
rms_norm_eps: float = 1e-05,
use_bias: bool = False,
attn_implementation="flash_attention_2",
initializer_range=0.02,
init_merger_std=0.02,
is_causal=False,
post_norm=True,
gradient_checkpointing=False,
**kwargs: Any,
)
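The defaults above fully determine some derived quantities: with `hidden_size=1536` and `num_attention_heads=12`, each head has dimension 128, and with `patch_size=14` and `spatial_merge_size=2`, each merged vision token covers a 28x28-pixel region. A minimal sketch mirroring the signature's defaults (a plain dataclass stand-in, not the actual `PretrainedConfig` subclass):

```python
from dataclasses import dataclass


@dataclass
class DotsVisionConfigSketch:
    # Defaults copied from the __init__ signature above; the helper methods
    # below are illustrative additions, not part of the vLLM class.
    embed_dim: int = 1536
    hidden_size: int = 1536
    intermediate_size: int = 4224
    num_hidden_layers: int = 42
    num_attention_heads: int = 12
    num_channels: int = 3
    patch_size: int = 14
    spatial_merge_size: int = 2
    temporal_patch_size: int = 1
    rms_norm_eps: float = 1e-05
    use_bias: bool = False
    attn_implementation: str = "flash_attention_2"
    initializer_range: float = 0.02
    init_merger_std: float = 0.02
    is_causal: bool = False
    post_norm: bool = True
    gradient_checkpointing: bool = False

    def head_dim(self) -> int:
        # Per-head width of the attention layers: 1536 / 12 = 128.
        return self.hidden_size // self.num_attention_heads

    def pixels_per_merged_token(self) -> int:
        # A spatial merge of 2x2 patches of 14x14 pixels each covers
        # (14 * 2) ** 2 = 784 pixels per output token.
        return (self.patch_size * self.spatial_merge_size) ** 2
```

Overriding any keyword at construction time works as with the real config, e.g. `DotsVisionConfigSketch(num_hidden_layers=24)` for a shallower vision tower.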