Company
ngrok
Date Published
Author
Joel Hans
Word count
2139
Language
English
Hacker News points
None

Summary

This article discusses using ngrok to connect a local workstation's AI development workflow to remote CPU/GPU compute, specifically for training and hosting custom AI models built on large language models (LLMs). The author presents a proof-of-concept system that lets developers train LLMs faster and more securely while maintaining good developer ergonomics. The author highlights the limitations of local-first development workflows: they require powerful hardware, tax the workstation, and make it difficult to collaborate with others or transition the work into an API. The proposed alternative is to connect to remote compute through ngrok, leveraging its universal ingress platform to secure and persist ingress to AI services.

The suggested tech stack is a Linux virtual machine with GPU acceleration, Docker, Ollama, ollama-webui, and ngrok. The article provides step-by-step instructions for launching the remote VM, installing and running Ollama and the web UI via Docker, setting up ngrok, and optimizing the setup for security and persistence. It concludes by highlighting the setup's potential for secure, collaborative, and cost-effective AI development.
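The summary does not reproduce the commands themselves, but the Docker step might look like the following minimal sketch, based on the run commands the Ollama and ollama-webui projects documented at the time; the image names, ports, and volume names are assumptions rather than details taken from the article:

    # Run Ollama with GPU acceleration, persisting downloaded models
    # in a named volume (requires the NVIDIA Container Toolkit)
    docker run -d --gpus=all -v ollama:/root/.ollama \
      -p 11434:11434 --name ollama ollama/ollama

    # Run the web UI, pointing it at the Ollama API on the host
    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway \
      -v ollama-webui:/app/backend/data --name ollama-webui \
      --restart always ghcr.io/ollama-webui/ollama-webui:main

With both containers up, the web UI listens on port 3000 and Ollama's API on port 11434 of the remote VM.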
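Exposing the web UI through ngrok is then a matter of authenticating the agent and starting a tunnel. A hedged example, assuming the UI is on port 3000 and Google OAuth gates access (the token and email are placeholders, not values from the article):

    # One-time agent setup with your ngrok authtoken
    ngrok config add-authtoken <YOUR_AUTHTOKEN>

    # Expose the web UI, gated behind Google OAuth for one allowed email
    ngrok http 3000 --oauth google --oauth-allow-email you@example.com

Anyone visiting the generated URL must sign in with the allowed Google account before reaching the UI, covering the security half of the article's "security and persistence" step.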
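For the persistence half, the ngrok agent can run as a background OS service driven by a config file instead of an interactive terminal session. A sketch under the same assumptions (the tunnel name, email, and file path are illustrative):

    # ~/.config/ngrok/ngrok.yml
    version: "2"
    authtoken: <YOUR_AUTHTOKEN>
    tunnels:
      ollama-ui:
        proto: http
        addr: 3000
        oauth:
          provider: google
          allow_emails:
            - you@example.com

    # Install and start the agent as an OS service so the tunnel
    # survives logouts and reboots
    ngrok service install --config ~/.config/ngrok/ngrok.yml
    ngrok service start

Combined with Docker's --restart always on the containers, the whole stack comes back up on its own after a VM reboot.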