Model Lineup
Weaver supports a variety of open-weight models for fine-tuning.
Supported Models
Nex-AGI Optimized Models
These are optimized versions of base models, fine-tuned for enhanced reasoning and instruction-following capabilities.
| Model ID | Parameters | Type | Context Length |
|---|---|---|---|
| nex-agi/Qwen3-30B-A3B-Nex-N1 | 30B (3B active) | MoE | 128K |
| nex-agi/Qwen3-32B-Nex-N1 | 32B | Dense | 128K |
| nex-agi/DeepSeek-V3.1-Nex-N1 | 671B (37B active) | MoE | 128K |
Qwen Series
| Model ID | Parameters | Type | Context Length |
|---|---|---|---|
| Qwen/Qwen3-8B | 8B | Dense | 128K |
| Qwen/Qwen3-32B | 32B | Dense | 128K |
| Qwen/Qwen3-30B-A3B | 30B (3B active) | MoE | 128K |
| Qwen/Qwen3-235B-A22B | 235B (22B active) | MoE | 128K |
DeepSeek Series
| Model ID | Parameters | Type | Context Length |
|---|---|---|---|
| deepseek-ai/DeepSeek-V3.1 | 671B (37B active) | MoE | 128K |
| deepseek-ai/DeepSeek-V3.2 | 671B (37B active) | MoE | 128K |
Model Types
- Dense: Standard transformer architecture with all parameters active
- MoE (Mixture of Experts): Only a subset of the parameters is active for each token, so the total parameter count can be much larger while per-token compute stays close to that of a dense model the size of the active subset
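To make the dense/MoE distinction concrete, the sketch below computes what share of each MoE model's parameters is active per token, using the total and active counts listed in the tables above. The `active_fraction` helper and the `MOE_MODELS` dictionary are illustrative names introduced here, not part of Weaver.

```python
# Total and active parameter counts (in billions), copied from the tables above.
# MOE_MODELS and active_fraction are hypothetical names used only for illustration.
MOE_MODELS = {
    "Qwen/Qwen3-30B-A3B": (30, 3),
    "Qwen/Qwen3-235B-A22B": (235, 22),
    "deepseek-ai/DeepSeek-V3.1": (671, 37),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that is active for each token."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


for model_id, (total_b, active_b) in MOE_MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{model_id}: {share:.1%} of parameters active per token")
```

For a dense model such as `Qwen/Qwen3-32B` the same ratio is 1.0, since every parameter participates in every token.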
Choosing a Model
For agent scenarios:
- Prioritize Nex-AGI optimized models (`nex-agi/*` series) for superior reasoning and tool-use capabilities
- These models are specifically fine-tuned for agentic workflows and multi-step problem solving
For other applications:
- Start with `Qwen/Qwen3-8B` or `Qwen/Qwen3-32B` for balanced performance
- Use `deepseek-ai/DeepSeek-V3.1` for best quality
- Use Nex-AGI optimized versions for enhanced reasoning capabilities
- MoE models offer strong performance with efficient inference
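The guidance above can be sketched as a small lookup. This is only an illustration of the selection rules on this page: `candidate_models` and its arguments are hypothetical and not part of the Weaver API, and the returned IDs are taken from the tables above.

```python
from typing import List

# Nex-AGI optimized model IDs, as listed in the "Nex-AGI Optimized Models" table.
NEX_AGI_MODELS = [
    "nex-agi/Qwen3-30B-A3B-Nex-N1",
    "nex-agi/Qwen3-32B-Nex-N1",
    "nex-agi/DeepSeek-V3.1-Nex-N1",
]


def candidate_models(scenario: str, best_quality: bool = False) -> List[str]:
    """Return model IDs worth trying first (hypothetical helper, not a Weaver function)."""
    if scenario == "agent":
        # Agent scenarios: prioritize the nex-agi/* series.
        return list(NEX_AGI_MODELS)
    if best_quality:
        # Other applications where quality matters most.
        return ["deepseek-ai/DeepSeek-V3.1"]
    # Other applications: balanced starting points.
    return ["Qwen/Qwen3-8B", "Qwen/Qwen3-32B"]


print(candidate_models("agent"))
print(candidate_models("chat"))
print(candidate_models("chat", best_quality=True))
```

Any ID returned here would be passed as `base_model` when creating a training client.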
Usage
To use any model, specify its Model ID when creating a training client:
```python
from weaver import ServiceClient

service_client = ServiceClient()
training_client = service_client.create_model(
    base_model="Qwen/Qwen3-8B",  # Replace with your chosen model
    lora_config={"rank": 32}
)
```

Next Steps
- Learn about Training and Sampling - Core APIs
- Explore Loss Functions - Available losses
- Understand Saving and Loading - Model persistence