With the broadening applications of deep learning, neural decoders have emerged as a key research focus, specifically aimed at improving the decoding performanc ...
SlopCodeBench evaluates coding agents under iterative specification refinement: the agent implements a spec, then extends its own code as the spec changes. This exposes behaviors that single-shot ...
For additional results and analyses, including the impact of pre-training, number of training subjects, normalization effects, and other key findings, please refer to our full paper. Here, "root" ...