Two years ago people started talking about Small Language Models. “Soon you’ll see them on every device,” they said. I agreed; it seemed like a promising direction for the future. It turns out they were wrong, and so was I. Let’s think about why that (hasn’t) happened.

First, it still isn’t technically feasible. High-quality models, most of which still have closed weights, require many gigabytes of memory and push the CPU/GPU like the game Crysis. Hardware-wise, we simply aren’t ready. Perhaps in the near future large models will run on phones and laptops with little delay and fewer resources, thanks to Moore’s law.
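To make the “many gigabytes” claim concrete, here is a back-of-the-envelope sketch of how much memory model weights alone take at different parameter counts and precisions. The parameter counts are illustrative, not tied to any particular model, and the figures ignore the KV cache and runtime overhead, which add more on top.

```python
def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate size of model weights in GB (weights only)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# fp16 uses 2 bytes per parameter, int8 one, int4 half a byte.
for name, params in [("3B", 3), ("7B", 7), ("70B", 70)]:
    for prec, bpp in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
        print(f"{name} @ {prec}: ~{weights_gb(params, bpp):.1f} GB")
```

Even a modest 7B model at fp16 needs roughly 14 GB just for weights, which is more RAM than most phones have; aggressive int4 quantization brings it down to about 3.5 GB, but at a quality cost.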

Second, at the time no one knew exactly what problems these models would solve. When there is a problem worth solving, like a good-quality lens in a small pocket device, or the SIM card, or NFC, engineers usually crack it quickly. But for LLMs on phones there is, so far, no compelling reason. And maybe Musk is a little bit right: the era of edge devices is still coming.