<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>HuggingFace Accelerate on Rauf Ibishov</title><link>http://raufibishov.com/tags/huggingface-accelerate/</link><description>Recent content in HuggingFace Accelerate on Rauf Ibishov</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>© 2026 Rauf Ibishov</copyright><lastBuildDate>Thu, 01 Jan 2026 00:00:00 +0000</lastBuildDate><atom:link href="http://raufibishov.com/tags/huggingface-accelerate/index.xml" rel="self" type="application/rss+xml"/><item><title>AzNEOBERT — Azerbaijani BERT from Scratch on 12B Tokens</title><link>http://raufibishov.com/projects/az-neobert/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>http://raufibishov.com/projects/az-neobert/</guid><description>&lt;p&gt;&lt;em&gt;Status: &lt;strong&gt;In training&lt;/strong&gt; · Phase 1 active on 8× NVIDIA H200 · Component 2 of the AzBERT pipeline&lt;/em&gt;&lt;/p&gt;

&lt;h2 id="tldr"&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;I am training AzNEOBERT, an Azerbaijani encoder language model built from scratch on the NeoBERT architecture. Training runs on 8× NVIDIA H200 GPUs scheduled via SLURM, over ~12.2B tokens of Azerbaijani text drawn from 11 corpus collections. The infrastructure stack combines DeepSpeed ZeRO-2, Flash Attention 3, and &lt;code&gt;torch.compile&lt;/code&gt;, reaching a throughput of 1.24M tokens/sec. Loss dropped from 11.1 (random init) to 2.35 within the first 4,100 steps.&lt;/p&gt;
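&lt;p&gt;As a rough illustration of this stack, here is a minimal HuggingFace Accelerate sketch with a DeepSpeed ZeRO-2 plugin and &lt;code&gt;torch.compile&lt;/code&gt;. It is not the actual training script: &lt;code&gt;build_model&lt;/code&gt; and &lt;code&gt;build_dataloader&lt;/code&gt; are hypothetical placeholders for the NeoBERT encoder and the tokenized corpus loader.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Hypothetical sketch, not the project's real script.
import torch
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# ZeRO stage 2 shards optimizer state and gradients across the 8 GPUs.
ds_plugin = DeepSpeedPlugin(zero_stage=2)
accelerator = Accelerator(mixed_precision="bf16", deepspeed_plugin=ds_plugin)

model = build_model()              # placeholder: NeoBERT-style encoder
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
train_loader = build_dataloader()  # placeholder: tokenized Azerbaijani batches

model, optimizer, train_loader = accelerator.prepare(model, optimizer, train_loader)
model = torch.compile(model)       # fused kernels for extra throughput

for batch in train_loader:
    loss = model(**batch).loss     # assumes the model returns an MLM loss
    accelerator.backward(loss)     # handles ZeRO-2 gradient partitioning
    optimizer.step()
    optimizer.zero_grad()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A script like this would be started with &lt;code&gt;accelerate launch&lt;/code&gt; (or under &lt;code&gt;srun&lt;/code&gt; in a SLURM job), with Accelerate handling process-group setup across the 8 GPUs.&lt;/p&gt;</description><media:content xmlns:media="http://search.yahoo.com/mrss/" url="http://raufibishov.com/projects/az-neobert/feature.svg"/></item></channel></rss>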