# Local AI with Ollama and Cerewro: confidential data stays on your machine

Ollama runs large language models locally on Windows, with no internet connection required. By connecting Ollama to Cerewro, you can process confidential documents with a capable model while no data ever leaves your network.

## Install Ollama and recommended models

```shell
winget install Ollama.Ollama
ollama pull llama3.2        # 3B parameters, ~2 GB RAM, very fast
ollama pull llama3.1:8b     # 8B parameters, ~8 GB RAM, very capable
ollama pull mistral         # 7B parameters, strong on European languages
ollama pull qwen2.5:7b     # 7B parameters, excellent for code and analysis
```
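Once a model is pulled, Cerewro (or any other client) talks to Ollama through its local REST API on port 11434. The snippet below is a minimal sketch: the model name and prompt are placeholders, and the final `curl` call assumes the Ollama service is running.

```shell
# Build a request for Ollama's local REST API (default port 11434).
# Model and prompt are placeholders; adjust to what you pulled.
MODEL="llama3.2"
PROMPT="Summarize this clause in two sentences: ..."
PAYLOAD=$(printf '{"model":"%s","prompt":"%s","stream":false}' "$MODEL" "$PROMPT")
echo "$PAYLOAD"
# Send it once the Ollama service is running (no data leaves localhost):
curl -s http://localhost:11434/api/generate -d "$PAYLOAD" 2>/dev/null || \
  echo "Ollama not reachable; start it with 'ollama serve'."
```

Because the endpoint is `localhost`, the request and the document content never cross the network boundary.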
| Situation | Cloud AI risk | Local AI solution |
|---|---|---|
| Law firm | Professional secrecy prohibits sending client documents to third parties | Data never leaves the firm's server |
| Medical practice | Health data (special category under the GDPR) requires explicit consent | On-premise processing complies with the GDPR |
| Banking data | PCI DSS prohibits sending card data to third parties | Local model for fraud analysis |
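A typical workflow for the scenarios above is to pipe the confidential file straight into a local model. This is a sketch: `client_contract.txt` and its contents are placeholders, and the `ollama run` step requires the service to be running with `llama3.2` pulled.

```shell
# Placeholder file and model; requires the Ollama service running locally.
FILE="client_contract.txt"
MODEL="llama3.2"
printf 'The receiving party shall keep all materials confidential.\n' > "$FILE"
# Pipe the document into the model; the text never leaves this machine:
ollama run "$MODEL" "Summarize the key obligations:" < "$FILE" 2>/dev/null || \
  echo "Ollama not running; start it with 'ollama serve'."
```

The same pattern scales to batch jobs: loop over a directory of files and write each summary next to its source, all on one machine.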
**GPU acceleration:** if your machine has an NVIDIA GPU with at least 8 GB of VRAM, Ollama detects and uses it automatically, speeding up inference by up to 10x compared with CPU-only operation.
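To verify that a model is actually running on the GPU, `ollama ps` lists each loaded model and how much of it sits in GPU versus CPU memory. The fallback message below is our own addition for when the service is not running.

```shell
# Show loaded models and their GPU/CPU placement; print a hint if the
# Ollama service is not reachable.
STATUS=$(ollama ps 2>/dev/null || echo "Ollama service not running; start it with 'ollama serve'.")
echo "$STATUS"
```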