
We build intelligent systems that can plan, adapt, and learn across extended time horizons, turning complex, multi-step goals into reliable outcomes.
Intelligence is a long game: planning, feedback, improvement.
Latest synthesis of public forecasts and community evidence on AGI timelines and capabilities.
Central tendency across major forecasts.
Optimistic but sourced first-demo estimates.
Share of researchers expecting AGI before 2060.
Combined sample used for our internal outlook.
We track signals from scaling trends, benchmark inflection points, and evaluation studies.
Advance the science of long-horizon autonomous systems, capable tool-use, robust alignment, and verifiable safety, while delivering efficient, domain-specific models that solve real tasks today.
Algorithms that plan and execute multi-step tasks over time.
Systems that choose, compose, and even generate tools.
Keep systems safe and value-consistent under distribution shift.
Hard, reproducible measurement for capability and safety.
Small language models (SLMs) and specialists that matter in production.
Our stack is built for reproducibility and scale.
We maintain rotating, leak-resistant test suites.
long-horizon task graphs
multi-API orchestration tasks
safe exploration & robustness
Planned cadence: results with every milestone; code when safe.
Peer-reviewed advances in long-horizon intelligence.
Open tasks and evaluation harnesses.
Open-source implementations where safe and useful.
Negotiation, delegation, and joint planning primitives with evals.
Runtime self-critique, repair, and verification loops for long tasks.
Compact specialists tuned for targeted workflows and low-latency use.
One harness for planning, tool-use, robustness, and safety metrics.
multi-agent planning prototypes; tool-use orchestrators; early self-reflection loops; internal eval tracks.
unified eval harness; domain-specific SLMs; published benchmarks + ablation studies.
Hidden test rotations, adversarial validation, and efficiency-first SLM designs ensure robust, scalable solutions.