Ollama Models
MoneyPrinter uses local Ollama models for AI-powered script generation and metadata creation.
Why Ollama?
Ollama provides local LLM inference with several advantages:
Privacy
All AI processing happens locally. No data is sent to cloud APIs.
Cost
Zero per-token charges. Run unlimited generations.
Speed
Low-latency inference with local models.
Flexibility
Choose from dozens of open-source models.
Installing Ollama
Download and Install
Start Ollama Service
Start the service with ollama serve. It listens on http://localhost:11434 by default.
On macOS/Windows, Ollama runs as a background service after installation. No need to manually run ollama serve.
Pulling Models
Basic Usage
Pull a model with ollama pull followed by the model name, for example ollama pull llama3.1:8b.
Recommended Models
For Script Generation
| Model | Size | RAM Required | Speed | Quality | Best For |
|---|---|---|---|---|---|
| llama3.1:8b | 4.7GB | 8GB | Fast | Excellent | General purpose, balanced |
| mistral:7b | 4.1GB | 8GB | Very Fast | Good | Quick iterations |
| llama3.1:70b | 40GB | 64GB | Slow | Outstanding | Best quality (GPU required) |
| qwen2.5:7b | 4.7GB | 8GB | Fast | Good | Multilingual content |
| phi3:mini | 2.3GB | 4GB | Very Fast | Fair | Low-resource environments |
Selecting a Model
Consider these factors:
- Available RAM: Model size + 2GB overhead
- GPU: CUDA/Metal acceleration for larger models
- Script quality: Larger models produce better scripts
- Generation speed: Smaller models are faster
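The RAM rule above can be turned into a quick filter. This is a minimal sketch that copies the download sizes from the table and applies the "model size + 2GB" estimate; the function name is illustrative, and real memory use also depends on context length and quantization.

```python
# Download sizes in GB, copied from the table above.
MODEL_SIZES_GB = {
    "llama3.1:8b": 4.7,
    "mistral:7b": 4.1,
    "llama3.1:70b": 40.0,
    "qwen2.5:7b": 4.7,
    "phi3:mini": 2.3,
}

OVERHEAD_GB = 2.0  # rule of thumb from the list above


def models_that_fit(ram_gb, sizes=MODEL_SIZES_GB, overhead=OVERHEAD_GB):
    """Return the models whose size plus overhead fits in ram_gb, in table order."""
    return [name for name, size in sizes.items() if size + overhead <= ram_gb]


print(models_that_fit(8))  # ['llama3.1:8b', 'mistral:7b', 'qwen2.5:7b', 'phi3:mini']
print(models_that_fit(6))  # ['phi3:mini']
```

Among the models that fit, the quality and speed columns of the table decide the rest.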
Configuring MoneyPrinter
Default Model
Set the fallback model in .env. Backend/gpt.py reads this value.
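A fallback lookup of this kind might look like the sketch below. The variable name OLLAMA_MODEL and the built-in default llama3.1:8b are assumptions made for illustration; check .env.example and Backend/gpt.py for the names the project actually uses.

```python
import os

ENV_VAR = "OLLAMA_MODEL"         # assumed variable name, not confirmed by this page
BUILTIN_DEFAULT = "llama3.1:8b"  # assumed last-resort default


def resolve_model(requested=None, env=os.environ):
    """Prefer the model picked in the UI, then the .env value, then the built-in default."""
    if requested and requested.strip():
        return requested.strip()
    return env.get(ENV_VAR, "").strip() or BUILTIN_DEFAULT
```

With this ordering, a model chosen in the UI always wins, and the .env value only applies when no model is chosen.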
Remote Ollama Server
To run Ollama on a different machine, point OLLAMA_BASE_URL in .env at that machine's address.
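The sketch below shows one way a client could turn OLLAMA_BASE_URL into request URLs, falling back to the local default when the variable is unset. The helper name api_url and the address 192.168.1.50 in the usage note are hypothetical.

```python
import os
from urllib.parse import urlsplit

LOCAL_DEFAULT = "http://localhost:11434"


def api_url(path, env=os.environ):
    """Join OLLAMA_BASE_URL (or the local default) with an API path."""
    base = env.get("OLLAMA_BASE_URL", "").strip() or LOCAL_DEFAULT
    parts = urlsplit(base)
    # Reject values without a scheme, a common .env mistake.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"OLLAMA_BASE_URL must look like http://host:port, got {base!r}")
    return base.rstrip("/") + "/" + path.lstrip("/")
```

For example, with OLLAMA_BASE_URL=http://192.168.1.50:11434 the call api_url("/api/tags") yields http://192.168.1.50:11434/api/tags. The remote Ollama instance must also be configured to accept connections from other machines.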
Docker Configuration
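When running MoneyPrinter in Docker with Ollama on the host, set OLLAMA_BASE_URL in .env to the host's address as seen from the container (on Linux, host.docker.internal requires an extra_hosts entry in docker-compose.yml).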
Model Selection in UI
The frontend (Frontend/app.js) fetches the list of available models from the backend API (Backend/main.py).
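The backend route itself is not shown here. As a sketch, assuming it relays Ollama's GET /api/tags response, the model names for the dropdown could be extracted like this (model_names is an illustrative helper, not a function from the project):

```python
import json


def model_names(tags_json):
    """Extract sorted model names from the body of Ollama's GET /api/tags response."""
    payload = json.loads(tags_json)
    return sorted(m["name"] for m in payload.get("models", []))


sample = '{"models": [{"name": "mistral:7b"}, {"name": "llama3.1:8b"}]}'
print(model_names(sample))  # ['llama3.1:8b', 'mistral:7b']
```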
Model Usage in Pipeline
Script Generation
Backend/gpt.py
Search Terms
Backend/gpt.py
Metadata Generation
Backend/gpt.py
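Each of the three stages above sends a prompt to the selected model. The sketch below shows what such a call could look like against Ollama's /api/generate endpoint with streaming disabled. The function names, the 180-second timeout, and the idea that all three stages share one helper are assumptions; the actual prompts and code live in Backend/gpt.py.

```python
import json
import urllib.request


def build_generate_request(base_url, model, prompt):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url, model, prompt, opener=urllib.request.urlopen, timeout=180):
    """Send the prompt and return the generated text from the "response" field."""
    req = build_generate_request(base_url, model, prompt)
    with opener(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

The opener parameter exists only so the function can be exercised without a running server; in normal use the default urllib.request.urlopen is enough.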
Model Performance
Benchmarks
Approximate generation times on Apple M2 Pro:
| Model | Script (1 paragraph) | Search Terms | Metadata | Total |
|---|---|---|---|---|
| llama3.1:8b | 5s | 3s | 8s | ~16s |
| mistral:7b | 3s | 2s | 5s | ~10s |
| llama3.1:70b | 25s | 15s | 35s | ~75s |
| phi3:mini | 2s | 1s | 3s | ~6s |
Timings exclude video download, TTS, and rendering. GPU acceleration significantly improves performance.
Troubleshooting
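No models in frontend dropdown
Verify Ollama is running by requesting http://localhost:11434/api/tags. A running server responds with a JSON list of installed models. If the connection is refused, start Ollama with ollama serve. If the list is empty, pull a model first.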
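The same check can be scripted. This sketch queries Ollama's /api/tags endpoint and separates the two usual causes of an empty dropdown: an unreachable server and a server with no models pulled. The function name and the message wording are illustrative.

```python
import json
import urllib.request


def diagnose(base_url="http://localhost:11434", opener=urllib.request.urlopen):
    """Classify why the model dropdown might be empty."""
    try:
        with opener(base_url.rstrip("/") + "/api/tags", timeout=5) as resp:
            models = json.loads(resp.read()).get("models", [])
    except OSError:  # covers URLError and ConnectionRefusedError
        return "unreachable: start Ollama (ollama serve) or check the base URL"
    if not models:
        return "no models installed: run ollama pull <model>"
    return "ok: " + ", ".join(m["name"] for m in models)
```

Calling diagnose() with no arguments checks the default local address.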
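Model not found error
This error means the requested model has not been pulled. Pull it with ollama pull followed by the model name. The backend (Backend/gpt.py) checks the available models and provides clear instructions when the requested one is missing.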
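A check of this kind could be sketched as follows. It is not the code from Backend/gpt.py: the function name, the error type, and the message wording are assumptions.

```python
def ensure_model_available(model, available):
    """Return the usable model name, or raise a helpful error if it has not been pulled."""
    if model in available:
        return model
    # Ollama lists untagged pulls as "<name>:latest", so accept that form too.
    if ":" not in model and f"{model}:latest" in available:
        return f"{model}:latest"
    installed = ", ".join(sorted(available)) or "none"
    raise ValueError(
        f"Model '{model}' not found. Installed models: {installed}. "
        f"Run: ollama pull {model}"
    )
```

Including the exact pull command in the message lets the user fix the problem without consulting the documentation.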
Ollama connection refused (Docker)
Check OLLAMA_BASE_URL in .env, then test connectivity from inside the container. On Linux, also verify that host.docker.internal resolves.
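Both checks can be run from a Python shell inside the container. The sketch below assumes Ollama listens on its default port 11434; the helper names are illustrative.

```python
import socket


def host_resolves(hostname="host.docker.internal", resolver=socket.gethostbyname):
    """Return the resolved IPv4 address, or None if the name does not resolve."""
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def port_open(host, port=11434, connect=socket.create_connection, timeout=3):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If host_resolves() returns None on Linux, the extra_hosts entry is probably missing. If the name resolves but port_open("host.docker.internal") returns False, Ollama is likely not listening on an address the container can reach.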
Out of memory errors
Symptoms:
- Ollama crashes during generation
- System becomes unresponsive
- “Out of memory” errors
Solutions:
- Use a smaller model, such as phi3:mini
- Limit concurrent generations (single worker)
- Add swap space (Linux)
Advanced Configuration
Model Parameters
Customizing generation parameters requires code changes in Backend/gpt.py.
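Ollama accepts sampling parameters in an options object on the generate request. The sketch below attaches three real Ollama options (temperature, top_p, num_predict) to a payload. The helper name and the default values are illustrative, not the project's settings.

```python
def with_options(payload, temperature=0.7, top_p=0.9, num_predict=512):
    """Return a copy of an /api/generate payload with sampling options attached."""
    return {
        **payload,
        "options": {
            "temperature": temperature,  # higher values give more varied text
            "top_p": top_p,              # nucleus sampling cutoff
            "num_predict": num_predict,  # maximum number of tokens to generate
        },
    }


base = {"model": "llama3.1:8b", "prompt": "Write a short script.", "stream": False}
print(with_options(base, temperature=0.2)["options"]["temperature"])  # 0.2
```

The original payload is left unchanged, so the same base request can be reused with different settings per pipeline stage.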
Timeout Configuration
Adjust the timeout for slow models in .env. Backend/gpt.py reads this value.
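Reading such a setting defensively might look like the sketch below. The variable name OLLAMA_TIMEOUT and the 180-second default are assumptions; check Backend/gpt.py for the real ones.

```python
import os

DEFAULT_TIMEOUT_S = 180.0  # assumed default, in seconds


def read_timeout(env=os.environ, var="OLLAMA_TIMEOUT"):
    """Parse the request timeout in seconds, falling back to the default on bad input."""
    raw = env.get(var, "").strip()
    try:
        value = float(raw)
    except ValueError:
        # Unset, empty, or non-numeric values use the default.
        return DEFAULT_TIMEOUT_S
    return value if value > 0 else DEFAULT_TIMEOUT_S
```

Given the benchmark figures above, a slow model such as llama3.1:70b needs a timeout comfortably above its per-request generation time.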
Next Steps
Generating Videos
Create your first video
Pipeline
Understand the generation process
Configuration
All environment variables
Troubleshooting
Common issues and solutions