“People always talk about these parameter counts, which is the size of the model. It’s basically how many different numbers am I multiplying, adding, doing these mathematical operations. When you input something to the language model, you tokenize it. Tokens are basically four letters.”

– Dylan Patel

Semafor talks with Dylan Patel, who deconstructed how GPT-4 works. Patel and his colleagues at SemiAnalysis, a consulting firm, described their findings in an online paper.

They found that the large language model relies on several much smaller models, each handling a piece of the text GPT-4 is generating at any given time.

They call these “experts.” It’s like breaking a big project into a collection of specialty areas and delegating tasks to each of them. SemiAnalysis says GPT-4 uses 16 experts. This approach saves time, speeding up results and reducing the cost of computing.
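
To make the "experts" idea concrete, here is a minimal sketch of how a mixture-of-experts layer might route a single token. It is a toy under stated assumptions, not GPT-4's actual design: the count of 16 experts comes from the SemiAnalysis figure above, while the choice of two active experts per token (`TOP_K`), the tiny vector size (`DIM`), the random weights and the class name `ToyMoELayer` are all hypothetical, picked only for illustration.

```python
import math
import random

NUM_EXPERTS = 16   # figure reported by SemiAnalysis, per the article
TOP_K = 2          # assumption for illustration; not stated in the article
DIM = 8            # toy vector size, hypothetical


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMoELayer:
    """A toy layer: one router plus NUM_EXPERTS small linear 'experts'."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.router = make_matrix(NUM_EXPERTS, DIM, rng)
        self.experts = [make_matrix(DIM, DIM, rng) for _ in range(NUM_EXPERTS)]

    def forward(self, token_vec):
        # Score every expert for this token, then keep only the TOP_K best.
        scores = matvec(self.router, token_vec)
        ranked = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)
        chosen = ranked[:TOP_K]
        weights = softmax([scores[i] for i in chosen])

        # Only the chosen experts do any arithmetic for this token.
        output = [0.0] * DIM
        for weight, idx in zip(weights, chosen):
            expert_out = matvec(self.experts[idx], token_vec)
            output = [o + weight * e for o, e in zip(output, expert_out)]
        return output, chosen


rng = random.Random(1)
token = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
layer = ToyMoELayer()
output, chosen = layer.forward(token)
print(f"experts used for this token: {len(chosen)} of {NUM_EXPERTS}")
```

The savings described above show up in the loop: the router scores all 16 experts, which is cheap, but only the selected few run their multiplications and additions, so most of the model's numbers sit idle for any given token.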


Revealing the mysteries of ChatGPT | SEMAFOR | July 28, 2023 | by Reed Albergotti