czech-beer-brand-name
Overview
- Environment ID: czech-beer-brand-name
- Short description: Predict a Czech beer brand from bottle appearance descriptions.
- Tags: czech, beer, classification, vision, ocr
Task
- Type: single-turn classification
- Output format expectations: reasoning allowed, final brand must be inside <guess>...</guess>
- Rubric:
  - brand_exact_match_weighted (weight 1.0): exact canonical match using aliases, scaled by description level difficulty.
  - brand_exact_match (metric only): unweighted exact match for plain accuracy tracking.
  - format_reward (weight 0.05): requires a valid XML answer tag format (default <guess>...</guess>).
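The rubric above can be sketched roughly as follows. This is a minimal illustration, not the environment's actual code: the function names, the choice to read the last answer tag, and the case-insensitive, whitespace-collapsing normalization are all assumptions made for the example.

```python
import re


def parse_guess(completion, tag="guess"):
    """Return the text inside the last <tag>...</tag> pair, or None if absent."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    matches = re.findall(pattern, completion, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def normalize(name):
    """Assumed normalization: case-insensitive, whitespace collapsed."""
    return " ".join(name.casefold().split())


def weighted_exact_match(completion, brand, aliases=(), level_multiplier=1.0,
                         wrong_answer_penalty=-0.1, tag="guess"):
    """Level multiplier on a correct guess, penalty when wrong or unparsable."""
    guess = parse_guess(completion, tag)
    if guess is None:
        return wrong_answer_penalty
    accepted = {normalize(brand)} | {normalize(a) for a in aliases}
    return level_multiplier if normalize(guess) in accepted else wrong_answer_penalty


def format_reward(completion, tag="guess"):
    """1.0 when a well-formed answer tag is present, else 0.0."""
    return 1.0 if parse_guess(completion, tag) is not None else 0.0


if __name__ == "__main__":
    reply = "The label shows a goat. <guess>Kozel</guess>"
    print(weighted_exact_match(reply, "Kozel"), format_reward(reply))
```

Reading the last tag rather than the first lets the model mention candidate brands while reasoning, at the cost of accepting a late change of mind.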
No Hardcoded Brands
- Brand names are not defined in code.
- The environment derives the full brand set from dataset_path (and eval_dataset_path if provided).
- Optional aliases can be provided per row via aliases.
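As a rough sketch of how a brand set and alias lookup could be derived from the dataset rows alone: the helper names below are hypothetical, and the case-insensitive alias index is an assumption, not a description of the environment's internals.

```python
import json


def load_rows(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    rows = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                rows.append(json.loads(line))
    return rows


def derive_brands(rows):
    """Collect the distinct brand values that appear in the rows."""
    return sorted({row["brand"].strip() for row in rows if row.get("brand")})


def build_alias_index(rows):
    """Map each casefolded brand or alias to its canonical brand name."""
    index = {}
    for row in rows:
        brand = row["brand"].strip()
        index.setdefault(brand.casefold(), brand)
        for alias in row.get("aliases") or []:
            index.setdefault(alias.strip().casefold(), brand)
    return index


if __name__ == "__main__":
    # Illustrative rows; real data would come from load_rows(dataset_path).
    rows = [
        {"brand": "Pilsner Urquell", "aliases": ["Plzeňský Prazdroj"]},
        {"brand": "Kozel"},
    ]
    print(derive_brands(rows))
    print(build_alias_index(rows))
```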
Image Dataset Workflow
Put bottle images in:
environments/czech_beer_brand_name/data/images/
Use either:
- Filename labels (quick start): pilsner_urquell_01.jpg, kozel_12.png
- Explicit labels file (labels.jsonl):
  - one row per image with image_path and brand
  - copy template from environments/czech_beer_brand_name/data/labels.template.jsonl
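To illustrate the two labeling options, here is a small sketch that turns a filename label into a labels.jsonl row. The parsing rule (drop a trailing numeric index, replace underscores with spaces) is an assumption for illustration; the actual script may treat filenames differently.

```python
import json
import re
from pathlib import Path


def brand_from_filename(filename):
    """Assumed rule: 'pilsner_urquell_01.jpg' -> 'pilsner urquell'."""
    stem = Path(filename).stem          # strip directory and extension
    stem = re.sub(r"_\d+$", "", stem)   # drop a trailing numeric index
    return stem.replace("_", " ")


def label_row(image_path):
    """Build one labels.jsonl row with the two fields named above."""
    return {"image_path": image_path, "brand": brand_from_filename(image_path)}


if __name__ == "__main__":
    for name in ["pilsner_urquell_01.jpg", "kozel_12.png"]:
        print(json.dumps(label_row(name), ensure_ascii=False))
```

An explicit labels file is the safer choice when a brand name itself ends in a number, since this filename rule would strip it.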
Generate OCR descriptions (bottle/logo/distinct features, no taste):
export OPENROUTER_API_KEY=your_key_here
python environments/czech_beer_brand_name/scripts/build_image_ocr_dataset.py \
--images-dir environments/czech_beer_brand_name/data/images \
--labels-jsonl environments/czech_beer_brand_name/data/labels.jsonl \
--out environments/czech_beer_brand_name/data/beer_bottles_ocr.jsonl \
--model minimax/minimax-m2
Output row format:
- required: brand, description
- optional: description_short, description_brief, description_minimal
- optional: aliases (list), image_path, source, label_method, id
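A quick sanity check on the output rows can catch malformed data before training. The validator below is a hypothetical helper based only on the field list above; it assumes required fields are non-empty strings and that aliases, when present, is a list of strings.

```python
REQUIRED_FIELDS = ("brand", "description")


def validate_row(row):
    """Return a list of problems found in one dataset row (empty if none)."""
    problems = []
    for key in REQUIRED_FIELDS:
        value = row.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty required field: {key}")
    aliases = row.get("aliases")
    if aliases is not None:
        if not isinstance(aliases, list) or not all(isinstance(a, str) for a in aliases):
            problems.append("aliases must be a list of strings")
    return problems


if __name__ == "__main__":
    # Made-up example row, not taken from the real dataset.
    row = {"brand": "Kozel", "description": "Brown bottle, cream label.", "aliases": ["kozel"]}
    print(validate_row(row))
```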
Generate 3 or 4 description levels from existing full descriptions:
export OPENROUTER_API_KEY=your_key_here
python environments/czech_beer_brand_name/scripts/generate_description_levels.py \
--in environments/czech_beer_brand_name/data/beer_bottles_ocr.jsonl \
--out environments/czech_beer_brand_name/data/beer_bottles_ocr_levels.jsonl \
--model openai/gpt-4.1-mini \
--num-levels 3
Environment Arguments
| Arg | Type | Default | Description |
|---|---|---|---|
| dataset_path | str | required | JSONL with labeled OCR rows |
| eval_dataset_path | str | "" | optional separate eval JSONL |
| eval_size | int | 40 | split size if no eval_dataset_path |
| include_candidates | bool | true | include candidate brand list in prompt |
| seed | int | 7 | deterministic split/shuffle seed |
| description_levels | str | "detailed,short,brief" | comma-separated levels to include (detailed, short, brief, minimal) |
| detailed_level_reward | float | 1.0 | reward multiplier for detailed descriptions |
| short_level_reward | float | 1.0 | reward multiplier for short descriptions |
| brief_level_reward | float | 1.0 | reward multiplier for brief descriptions |
| minimal_level_reward | float | 1.0 | reward multiplier for minimal descriptions |
| wrong_answer_penalty | float | -0.1 | reward returned when predicted brand is wrong or unparsable |
| format_reward_weight | float | 0.0 | rubric weight for XML format reward metric |
| answer_tag | str | "guess" | XML tag used for final answer (model must answer in <tag>...</tag>) |
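One way to picture how description_levels and the per-level reward multipliers might interact is the sketch below. The mapping from level names to row fields (detailed to description, short to description_short, and so on) is inferred from the output row format and is an assumption, as are the function names.

```python
# Assumed mapping from level name to the dataset field holding its text.
LEVEL_FIELDS = {
    "detailed": "description",
    "short": "description_short",
    "brief": "description_brief",
    "minimal": "description_minimal",
}


def parse_levels(spec="detailed,short,brief"):
    """Split a comma-separated level string and reject unknown names."""
    levels = [part.strip() for part in spec.split(",") if part.strip()]
    unknown = [level for level in levels if level not in LEVEL_FIELDS]
    if unknown:
        raise ValueError(f"unknown description levels: {unknown}")
    return levels


def expand_row(row, levels, multipliers):
    """Emit one example per requested level that the row actually contains."""
    examples = []
    for level in levels:
        text = row.get(LEVEL_FIELDS[level])
        if text:
            examples.append({
                "brand": row["brand"],
                "level": level,
                "description": text,
                "reward_multiplier": multipliers.get(level, 1.0),
            })
    return examples


if __name__ == "__main__":
    row = {"brand": "Kozel", "description": "full text", "description_brief": "brief text"}
    for example in expand_row(row, parse_levels(), {"brief": 1.5}):
        print(example)
```

Raising a multiplier for a sparser level, as in the usage above, would reward correct guesses made from less information more heavily.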
Quickstart
prime eval run czech-beer-brand-name -m gpt-4.1-mini -n 20 -r 1 \
-a '{"dataset_path":"environments/czech_beer_brand_name/data/beer_bottles_ocr.jsonl","eval_size":8}' -s
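Hand-writing the JSON passed to -a is error-prone once more arguments are involved. The sketch below builds the same command from a Python dict; it reuses only the flags shown in the command above, and the helper name and its parameters are hypothetical.

```python
import json
import shlex


def build_eval_command(env_id, model, env_args, n=20, r=1):
    """Assemble the quickstart command with shell-safe quoting of the JSON args."""
    args_json = json.dumps(env_args, separators=(",", ":"))
    parts = ["prime", "eval", "run", env_id, "-m", model,
             "-n", str(n), "-r", str(r), "-a", args_json, "-s"]
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    env_args = {
        "dataset_path": "environments/czech_beer_brand_name/data/beer_bottles_ocr.jsonl",
        "eval_size": 8,
    }
    print(build_eval_command("czech-beer-brand-name", "gpt-4.1-mini", env_args))
```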