
The concept “understand” is rooted in utility. It means “I have built a much simpler model of the thing or behaviour I seek to ‘understand’, one which produces usefully accurate predictions of it”. This utility is “explanatory power”. The model may be in your head, it may be math, an algorithm, or a narrative, or it may be a methodology with a history of utility. “Greater understanding” is associated with models that are simpler, more essential, more accurate, more useful, cheaper, more decomposed, more composable, more easily communicated or replicated, or more widely applicable.

“Pattern matching”, “next token prediction”, “tensor math”, and “gradient descent”, or even the understanding and application of these by specialists, are not useful models of what LLMs do, any more than “have sex, then feed and talk to the resulting artifact for 18 years” is a useful model of human physiology or psychology.
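
To make the analogy concrete, here is a toy next-token trainer, a minimal sketch of my own in PyTorch (the model size, data, and hyperparameters are arbitrary, and none of this comes from any real LLM). The entire mechanism, tensor math plus next-token prediction plus gradient descent, fits in a dozen lines, yet reading it tells you almost nothing about what a model trained at scale will actually do:

    # Toy next-token predictor: illustrative only, not any real LLM.
    import torch

    torch.manual_seed(0)
    vocab, dim = 50, 16                       # arbitrary toy sizes
    emb = torch.nn.Embedding(vocab, dim)      # token embeddings
    head = torch.nn.Linear(dim, vocab)        # next-token logits
    opt = torch.optim.SGD(
        list(emb.parameters()) + list(head.parameters()), lr=0.1)

    tokens = torch.randint(0, vocab, (100,))  # stand-in "corpus"
    for step in range(200):
        x, y = tokens[:-1], tokens[1:]        # predict each next token
        logits = head(emb(x))                 # "tensor math"
        loss = torch.nn.functional.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()                       # "gradient descent"
        opt.step()

Everything interesting about a large model's behaviour lives in the learned weights, not in this loop; knowing the mechanism is not the same as having an explanation.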

My understanding, and I'm not a specialist, is that there are huge and consequential utility gaps in our models of LLMs. So much so that it is reasonable to say we don't yet understand how they work.




