An enjoyable new LLM benchmark I came across recently: Bullshit Benchmark! The concept is very simple: ask an LLM questions that do not make any sense, then check whether it spots that the question does not make any…