Introducing Prem-1B
Blog post from Prem AI
Prem AI has introduced Prem-1B, a series of open-source language models intended to widen access to capabilities that have traditionally been available only through closed-model APIs. The models are available on HuggingFace under the Apache 2.0 license, are optimized for Retrieval-Augmented Generation (RAG), and support an extended context length of 8192 tokens so they can handle multi-turn conversations efficiently. Training ran on 16 H100 GPUs, with Ray coordinating the multi-GPU setup, and the architecture is a decoder-only transformer similar to Llama 2. Pre-training used the SlimPajama dataset with Llama's tokenizer, covering a corpus of 600 billion tokens, and a chat fine-tuning stage then adapted the model for conversational use. Finally, Direct Preference Optimization (DPO) was applied to align the model's responses with human preferences, and the resulting model performs competitively across various benchmarks. Future plans include improving the model's performance, exploring further alignment techniques, and raising the quality of the data used for training and fine-tuning.
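To make the DPO step more concrete, here is a minimal sketch of the per-pair loss in the standard DPO objective. It is not Prem AI's training code: the function names, the `beta` value, and the sample log-probabilities are illustrative assumptions. Each argument is the summed log-probability of a full response under either the model being trained (the policy) or a frozen reference model.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) preference pair.

    beta is a hypothetical default; it scales how strongly the policy
    is pushed away from the reference model.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # The loss is -log(sigmoid(margin)), which equals softplus(-margin).
    margin = beta * (chosen_logratio - rejected_logratio)
    return softplus(-margin)


# Illustrative values only (assumed, not taken from the blog post).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy identical to reference
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)   # policy favors the chosen reply
print(baseline, improved)
```

When the policy matches the reference, the margin is zero and the loss equals log 2 (about 0.693). As the policy raises the likelihood of the preferred response relative to the rejected one, the loss falls below that value, which is the signal DPO optimizes.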