Private AI4Science Leaderboard

AI4Science Leaderboard

Representative materials-science and chemistry tasks evaluated across providers, model settings, zero-shot repeats, and in-context learning experiments.

Comparison within the selected provider

This plot compares models from one provider at a time. The Pareto front is recomputed using only the selected provider's models.

Comparison across providers

This plot compares all providers for the selected task and experiment. The Pareto front is recomputed globally across providers.
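The per-provider and cross-provider views differ only in which models are passed to the Pareto computation. The following is a minimal sketch of that idea, not the dashboard's actual code. It assumes the two axes are cost (lower is better) and a performance score (higher is better); the `Result` type, its field names, and the sample values are hypothetical.

```python
from typing import NamedTuple


class Result(NamedTuple):
    # Hypothetical record layout; the dashboard's real schema is not shown.
    provider: str
    model: str
    cost: float   # assumed: lower is better
    score: float  # assumed: higher is better


def pareto_front(results: list[Result]) -> list[Result]:
    """Return the results that no other result dominates."""
    front = []
    for r in results:
        # `o` dominates `r` if it is at least as good on both axes
        # and strictly better on at least one.
        dominated = any(
            o.cost <= r.cost
            and o.score >= r.score
            and (o.cost < r.cost or o.score > r.score)
            for o in results
        )
        if not dominated:
            front.append(r)
    return front


# Illustrative values only, not leaderboard data.
results = [
    Result("A", "a-small", 1.0, 0.60),
    Result("A", "a-large", 4.0, 0.80),
    Result("B", "b-small", 0.5, 0.65),
    Result("B", "b-large", 5.0, 0.78),
]

# Cross-provider view: the front is computed over every model.
global_front = pareto_front(results)

# Within-provider view: filter first, then recompute the front.
provider_a_front = pareto_front([r for r in results if r.provider == "A"])
```

In this toy data, a model can sit on its own provider's front while being dominated globally, which is why the two plots can disagree about the same model.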

Aggregate table

The table aggregates the repeated runs: mean performance, variation, mean cost, Pareto frequency, and run IDs.
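As a rough illustration of how such a table could be derived, the sketch below groups per-run records by model. It rests on assumptions not stated on the page: "variation" is taken to be the sample standard deviation of the score, and "Pareto frequency" is taken to be the fraction of runs in which the model was on the Pareto front. The record keys, the `aggregate` function, and the sample values are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical input: one record per (run, model), with a flag saying
# whether the model was on the Pareto front in that run.
runs = [
    {"run_id": "r1", "model": "m1", "score": 0.70, "cost": 1.0, "on_front": True},
    {"run_id": "r2", "model": "m1", "score": 0.74, "cost": 1.2, "on_front": False},
    {"run_id": "r1", "model": "m2", "score": 0.80, "cost": 3.0, "on_front": True},
    {"run_id": "r2", "model": "m2", "score": 0.80, "cost": 3.0, "on_front": True},
]


def aggregate(records):
    """Group per-run records by model and summarize each group."""
    by_model = {}
    for rec in records:
        by_model.setdefault(rec["model"], []).append(rec)

    table = {}
    for model, recs in by_model.items():
        scores = [r["score"] for r in recs]
        table[model] = {
            "mean_score": mean(scores),
            # Assumed definition of "variation": sample standard deviation,
            # reported as 0.0 when only one run exists.
            "score_std": stdev(scores) if len(scores) > 1 else 0.0,
            "mean_cost": mean(r["cost"] for r in recs),
            # Assumed definition: share of runs where the model was on the front.
            "pareto_frequency": sum(r["on_front"] for r in recs) / len(recs),
            "run_ids": [r["run_id"] for r in recs],
        }
    return table


table = aggregate(runs)
```

Keeping the run IDs alongside the summary statistics makes it possible to trace any aggregate row back to the individual runs that produced it.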