New Set of German Language Models
In a joint effort with DiscoResearch, we are releasing a set of new German language models, available on Hugging Face. All models are based on Llama-3-8B and were continually pre-trained on 65B high-quality German tokens from our occiglot-fineweb dataset. As with our prior releases, we provide both a base and an instruction-tuned version of the model. In addition to these variants, which were trained solely with an 8k context, we also release a long-context variant (DiscoResearch/Llama3_German_8B_32k)....
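For readers who want to try the long-context variant, here is a minimal sketch of loading it with the Hugging Face `transformers` library. It assumes `transformers` and PyTorch are installed and that the repository ID named above loads through the standard `AutoModelForCausalLM` and `AutoTokenizer` classes. The helper names `load_model` and `complete` are hypothetical and used only for illustration. Dtype and device placement are left at library defaults.

```python
# Repository ID as given in the announcement text.
MODEL_ID = "DiscoResearch/Llama3_German_8B_32k"


def load_model(model_id: str = MODEL_ID):
    """Download (or load from cache) the tokenizer and model weights.

    Hypothetical helper for illustration. transformers is imported lazily
    so that defining this function does not require the library.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue a German prompt with the base model (hypothetical helper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the full sequence (prompt plus continuation) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("Die Hauptstadt von Deutschland ist")` would download the 8B-parameter weights on first use, so expect a large download and substantial memory use. Since this is a base model, it continues text rather than following instructions.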