
Introducing DictaLM -- A Large Generative Language Model For Modern Hebrew

Shaltiel Shmidman, Avi Shmidman, Amir David Nissan Cohen, Moshe Koppel. arXiv 2023

[Paper]
Tags: Fine Tuning, Pretraining Methods, Training Techniques

We present DictaLM, a large-scale language model tailored for Modern Hebrew. With 7B parameters, this model is predominantly trained on Hebrew-centric data. As a commitment to promoting research and development in the Hebrew language, we release both the foundation model and the instruct-tuned model under a Creative Commons license. Concurrently, we introduce DictaLM-Rab, another foundation model geared towards Rabbinic/Historical Hebrew. These foundation models serve as ideal starting points for fine-tuning on various Hebrew-specific tasks, such as instruction-following, Q&A, sentiment analysis, and more. This release represents a preliminary step, offering an initial Hebrew LLM for the Hebrew NLP community to experiment with.

Similar Work