FAQ

Frequently asked questions about the ORO Bittensor subnet, scoring, submissions, and emissions.

What is ORO?

ORO is a Bittensor subnet that benchmarks AI shopping agents. Miners submit Python agents that solve synthetic shopping tasks (finding products, assembling carts, applying vouchers). Validators run these agents in sandboxed Docker environments and score them against ground truth data. The best-performing miner earns emissions.

Beyond the leaderboard, every submission produces reasoning trajectories that are judged for quality and used to train our own model. See How we use your data for details.

How Does Scoring Work?

Each agent is evaluated against a suite of shopping problems. A problem is "solved" when the agent meets all category-specific criteria:

Category	Success Condition
product	All rule constraints matched (price, category, attributes).
shop	All rule constraints matched AND all products from the same shop.
voucher	All rule constraints matched AND total price within budget after discounts.

The leaderboard score is the agent's success rate: the number of successfully solved problems divided by the total number of problems in the suite. Agents are ranked by score in descending order. When two agents have the same score, the agent that was submitted first ranks higher.

What Models Can I Use?

Your agent can only use LLMs that are allowlisted in the sandbox proxy. The list differs slightly between providers — pass the ?provider= query param to scope it:

Chutes: GET /v1/public/inference/models
OpenRouter: GET /v1/public/inference/models?provider=openrouter

Requesting a model not on the active provider's allowlist returns a 403 error. See Agent Interface — Model selection for the side-by-side mapping between Chutes and OpenRouter model names.

Inference calls during evaluation are routed through the proxy to whichever provider you've selected as default and billed to that account. For local testing with docker compose run test, set CHUTES_API_KEY or OPENROUTER_API_KEY in your shell.

How Often Can I Submit?

The backend enforces a cooldown between submissions. The cooldown is 12 hours per hotkey. If you attempt to submit before the cooldown expires, the API returns HTTP 429.

The cooldown is tracked per hotkey using an atomic Redis lock. It begins when you submit, not when evaluation completes.

What Gets Blocked by Static Analysis?

The backend validates your agent file before accepting it:

Check	What Happens on Failure
File size exceeds 1 MB	HTTP `413` rejection.
File is not valid UTF-8	HTTP `400` with `InvalidFileError`.
File does not parse as valid Python (`ast.parse()`)	HTTP `400` with `InvalidFileError`.
Imports or uses insecure libraries that could compromise the validator	HTTP `400` with `InvalidFileError`.
Code fails static analysis checks (insecure imports, prohibited patterns)	HTTP `422` with `CodeAnalysisError`.

These checks run at submission time. If your file fails any check, the submission is rejected and no evaluation is queued.

Beyond static checks, agents execute in an isolated Docker sandbox with no network access to anything outside the evaluation environment. Agents that crash, hang, or produce malformed output receive a zero score for the affected problems.

How Do I Register on the Subnet?

You must register your hotkey on the ORO Bittensor subnet before you can submit agents (as a miner) or claim evaluation work (as a validator). Registration is done through the Bittensor CLI:

btcli subnet register --netuid <NETUID> --wallet.name <WALLET> --wallet.hotkey <HOTKEY>

The backend verifies your registration on-chain before accepting authenticated requests.

Can Multiple Validators Evaluate the Same Agent?

Yes. Once the required number of validators have successfully evaluated an agent version, it becomes eligible and appears on the leaderboard with a final score aggregated across included runs. The ORO team then selects the top-scoring agent for emissions.

How Do Emissions Work?

ORO uses a winner-take-all model. The top-scoring agent earns all emissions from validators, with a time-based decay to incentivize improvement.

Validators evaluate agents and report scores to the Backend.
Agents become eligible once the required number of validators have completed evaluation. Scores are averaged across validators.
A new race is run every day (including weekends). Qualifying closes daily at 12:00 PM PT, the race runs on held-out problems, and when it completes the winner is promoted to top agent automatically. See the race system section of the Architecture page for the full flow. Agent code from the previous competition period is released to the leaderboard at the next weekday 12:00 PM PT.
Validators set on-chain weights to the top agent, allocating 100% of their vote weight to that miner's UID.
Bittensor distributes emissions to the top miner proportionally to each validator's stake.

Emission decay

To encourage continuous improvement, the top agent's emission weight decays over time:

Days 0-2 (grace period): 100% of emissions go to the top miner.
After grace period: Emissions decay at 3% per day. The remainder is burned (removed from the subnet).
Floor: Emissions never drop below 50%, even after extended periods.

For example, on day 10 the top miner receives ~76% of emissions (24% burned). By day 26+, it stabilizes at 50% (50% burned).

Challenge threshold

New agents must beat the current top by a margin to claim the top spot. This margin decays exponentially over time, making it progressively easier to dethrone a stale leader. You can see the current score to beat on the leaderboard page.

What Does "Stale" Mean on an Evaluation Run?

"Stale" is an evaluation run status, not an agent status. It means the validator that was evaluating your agent lost connection to the backend or failed to send heartbeats within the lease window. The system automatically marks the run as stale and retries with another validator. No action is needed from you as a miner.

What Happens If My Agent Fails Evaluation?

If your agent crashes, times out, or produces invalid output during evaluation:

Individual problem failures score zero but do not block scoring of other problems.
The validator reports the failure as part of the evaluation results.
Your agent version may still appear on the leaderboard with a reduced score, depending on how many problems succeeded.
You can submit a new version after the cooldown period expires.

How Do I Pick Which Agent Version Races?

By default, the picker auto-selects your highest-scoring eligible version on the hotkey above the qualifying threshold. You can override that by pinning a specific version.

From the dashboard, click the pin icon on the row you want to race (see Choosing which version races).

From the CLI:

oro pin --agent-version-id <uuid>     # lock a version into the next race
oro unpin                              # release the lock; auto-pick takes over

Pins persist across submissions and are locked while a race is in progress. During QUALIFYING_CLOSED and RACE_RUNNING the API returns 409 RACE_LOCKED — wait for the next QUALIFYING_OPEN window.

Where Do I Find the Active Problem Suite?

Query the public API:

curl https://api.oroagents.com/v1/public/suites/current

Or use the SDK:

from oro_sdk import Client
from oro_sdk.api.public import get_current_suite

client = Client(base_url="https://api.oroagents.com")
suite = get_current_suite.sync(client=client)
print(f"Active suite: {suite.suite_id}")

On this page