loaders.model
Model loader class implementation for loading, configuring, and patching various models.
Classes
| Name | Description |
|---|---|
| ModelLoader | Manages model configuration, initialization, and application of patches during model loading. |
ModelLoader
loaders.model.ModelLoader(
cfg,
tokenizer,
*,
inference=False,
reference_model=False,
**kwargs,
)
Manages model configuration, initialization, and application of patches during model loading.
This class orchestrates the entire process of loading a model from configuration to final preparation. It handles device mapping, quantization, attention mechanisms, adapter integration, and various optimizations.
The loading process includes:
- Loading and validating model configuration
- Applying monkey patches for optimizations / fixes
- Setting up device mapping (including multi-GPU configurations)
- Configuring quantization
- Setting attention mechanisms (Flash Attention, SDPA, etc.)
- Loading and initializing the model
- Applying adapters (LoRA, QLoRA, etc.)
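The step order above can be pictured with a small, self-contained sketch. It is not the library's implementation: `SketchLoader`, its private method names, and the `cfg` keys it reads (`device_map`, `load_in_4bit`, `attention`, `adapter`) are hypothetical stand-ins, and each step only records that it ran.

```python
from typing import Any


class SketchLoader:
    """Hypothetical stand-in that mirrors the documented step order only."""

    def __init__(self, cfg: dict[str, Any], *, inference: bool = False) -> None:
        self.cfg = cfg
        self.inference = inference
        self.model_kwargs: dict[str, Any] = {}
        self.steps: list[str] = []  # records the order in which steps ran

    def _load_config(self) -> None:
        # Validate the configuration before anything else happens.
        if "base_model" not in self.cfg:
            raise ValueError("cfg must name a base_model")
        self.steps.append("config")

    def _apply_patches(self) -> None:
        # Placeholder: a real loader would apply monkey patches here.
        self.steps.append("patches")

    def _set_device_map(self) -> None:
        self.model_kwargs["device_map"] = self.cfg.get("device_map", "auto")
        self.steps.append("device_map")

    def _set_quantization(self) -> None:
        if self.cfg.get("load_in_4bit"):
            self.model_kwargs["load_in_4bit"] = True
        self.steps.append("quantization")

    def _set_attention(self) -> None:
        self.model_kwargs["attn_implementation"] = self.cfg.get("attention", "sdpa")
        self.steps.append("attention")

    def _build_model(self) -> dict[str, Any]:
        # A plain dict stands in for the model object in this sketch.
        self.steps.append("model")
        return {"name": self.cfg["base_model"], "kwargs": dict(self.model_kwargs)}

    def _apply_adapter(self, model: dict[str, Any]) -> tuple[dict[str, Any], Any]:
        self.steps.append("adapter")
        adapter = self.cfg.get("adapter")
        if adapter is None:
            return model, None  # no adapter requested, so no adapter config
        return model, {"adapter": adapter}

    def load(self) -> tuple[dict[str, Any], Any]:
        """Run every step in the documented order and return (model, adapter config)."""
        self._load_config()
        self._apply_patches()
        self._set_device_map()
        self._set_quantization()
        self._set_attention()
        model = self._build_model()
        return self._apply_adapter(model)
```

The sketch collects keyword arguments in `model_kwargs` before the model is built, which matches the role the `model_kwargs` attribute is described as playing; everything else about the real class's internals is left open.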
Attributes
| Name | Type | Description |
|---|---|---|
| model | PreTrainedModel \| PeftModel \| PeftMixedModel | The loaded model instance (available after load() is called). |
| model_kwargs | dict[str, Any] | Dictionary of keyword arguments passed to model initialization. |
| base_model | | Name or path of the base model to load. |
| model_type | | Type of model to load (e.g., AutoModelForCausalLM). |
| model_config | | Configuration object for the model. |
| auto_model_loader | | Class used for loading the model (default: AutoModelForCausalLM). |
Methods
| Name | Description |
|---|---|
| load | Load and prepare the model with all configurations and patches. |
load
loaders.model.ModelLoader.load()
Load and prepare the model with all configurations and patches.
Returns
| Name | Type | Description |
|---|---|---|
| | tuple[PreTrainedModel \| PeftModelForCausalLM, PeftConfig \| None] | A tuple with the loaded model and its LoRA configuration (if applicable). |
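Because the second element of the returned tuple may be `None`, callers will usually unpack both values and branch on the configuration. The following is a minimal sketch that assumes only the constructor signature and return type documented above. `load_and_describe` is a hypothetical helper, and the loader class is passed in as a parameter so the snippet does not assume any particular import path.

```python
from typing import Any


def load_and_describe(loader_cls: Any, cfg: Any, tokenizer: Any) -> tuple[Any, str]:
    """Build a loader, call load(), and summarize what came back."""
    # inference and reference_model follow the bare `*` in the signature,
    # so they must be passed by keyword.
    loader = loader_cls(cfg, tokenizer, inference=True, reference_model=False)

    # load() returns (model, adapter configuration or None).
    model, peft_config = loader.load()

    if peft_config is None:
        summary = "base model only, no adapter configuration"
    else:
        summary = f"adapter configuration: {type(peft_config).__name__}"
    return model, summary
```

With the real class this would be called as `load_and_describe(ModelLoader, cfg, tokenizer)`, where `cfg` and `tokenizer` are whatever configuration object and tokenizer the surrounding application has already prepared.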