Every time someone suggests Codex I give it a shot. And every time it disappoint...

baq · 2026-02-17T11:38:41 1771328321

moral of the story is use both (or more) and pick the one that works - or even merge the best ideas from generated solutions. independent agentic harnesses support multi-model workflows.

enraged_camel · 2026-02-17T12:33:27 1771331607

I don't think that's the moral of the story at all. It's already challenging enough to review the output from one model. Having to review two, and then comparing and contrasting them, would more than double the cognitive load. It would also cost more.

I think it's much more preferable to pick the most reliable one and use it as the primary model, and think of others as fallbacks for situations where it struggles.

baq · 2026-02-17T12:50:16 1771332616

you should always benchmark your use cases and you obviously don't review multiple outputs; you only review the consensus.

see how perplexity does it: https://www.perplexity.ai/hub/blog/introducing-model-council