Random Control
Establishes the baseline. This bot chooses future click positions randomly and should converge toward a 50/50 distribution with average reward near zero.
Future Randomness Reward System
A backend timing laboratory in which separate automated agents choose future click positions before a drand quicknet round is public. Each decision selects a deterministic 10-bit slice and is rewarded when 1s dominate that slice.
The hypothesis is that a machine learning system, as a stand-in for the brain, can influence or predict future randomness through its reward system, which acts as a stand-in for human or animal intention or desire.
The math bot uses a transparent algorithmic search process rather than true machine learning. It tests whether simple optimization can appear to find timing patterns.
The ML bot uses a reward-trained model to select future timing positions. It is the primary experimental agent because it most closely represents desire-driven learning.
The ML bot trains on scored prior decisions, predicts reward from timing features, and chooses future click positions before the target drand round is revealed.
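The train, predict, and choose loop described above can be illustrated with a deliberately small stand-in. This sketch does not reproduce the system's actual model: it swaps in a per-bucket mean-reward estimator with epsilon-greedy selection, and the names `BucketRewardModel`, `BUCKET_MS`, and `epsilon` are assumptions made for illustration only.

```python
import random

MAX_CLICK_MS = 3000   # click window, from the text
BUCKET_MS = 100       # assumption: the only timing feature is a 100 ms bucket index


class BucketRewardModel:
    """Predicts reward for a click time as the mean past reward of its bucket."""

    def __init__(self) -> None:
        self.total: dict[int, float] = {}
        self.count: dict[int, int] = {}

    def train(self, click_ms: int, reward: float) -> None:
        """Record one scored prior decision."""
        bucket = click_ms // BUCKET_MS
        self.total[bucket] = self.total.get(bucket, 0.0) + reward
        self.count[bucket] = self.count.get(bucket, 0) + 1

    def predict(self, click_ms: int) -> float:
        """Mean reward seen in this bucket, or 0.0 if the bucket is unseen."""
        bucket = click_ms // BUCKET_MS
        seen = self.count.get(bucket, 0)
        return self.total[bucket] / seen if seen else 0.0

    def choose(self, rng: random.Random, epsilon: float = 0.1) -> int:
        """Explore with probability epsilon, else pick the best-predicted time."""
        if rng.random() < epsilon:
            return rng.randint(0, MAX_CLICK_MS)
        candidates = range(0, MAX_CLICK_MS + 1, BUCKET_MS)
        return max(candidates, key=self.predict)
```

In this sketch the choice is made from past scores only, so it can be committed before the target round is revealed. If the round output is unpredictable, the per-bucket means are noise and the bot should perform like the random control.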
The math bot uses genome selection and mutation to search timing positions. It is not the main ML target; it is a rule-based comparison group.
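Genome selection and mutation can be sketched as a single generation step. The details here are assumptions, not the system's actual rules: a genome is taken to be one click time in milliseconds, the fitter half of the population survives, and `mutate`, `next_generation`, and `step_ms` are hypothetical names.

```python
import random
from typing import Callable

MAX_CLICK_MS = 3000  # click window, from the text


def mutate(click_ms: int, rng: random.Random, step_ms: int = 50) -> int:
    """Nudge a genome by a small random amount, clamped to the click window."""
    moved = click_ms + rng.randint(-step_ms, step_ms)
    return max(0, min(MAX_CLICK_MS, moved))


def next_generation(
    population: list[int],
    fitness: Callable[[int], float],
    rng: random.Random,
) -> list[int]:
    """Keep the fitter half, then refill with mutated copies of the survivors."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: max(1, len(ranked) // 2)]
    n_children = len(population) - len(survivors)
    children = [mutate(rng.choice(survivors), rng) for _ in range(n_children)]
    return survivors + children
```

Here `fitness` would be something like the average reward a click time has earned so far. Every step is inspectable, which is what makes this kind of search usable as a rule-based comparison group.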
This is the most important sanity check. The random bot should average toward 50% ones and zero reward over time. Any strong, persistent structure here means the extraction rule or data pipeline needs debugging before target-agent results are interpreted.
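One way to make this check concrete is a z-score on the number of 1s across all control slices, compared with a fair-coin expectation. This is a sketch under stated assumptions: the function name `ones_z_score`, the representation of slices as strings of '0' and '1', and the cutoff of 3 standard deviations are illustrative choices, not part of the system.

```python
import math
import random


def ones_z_score(slices: list[str]) -> float:
    """Z-score of the observed count of 1s against a 50/50 expectation."""
    total_bits = sum(len(s) for s in slices)
    if total_bits == 0:
        raise ValueError("no bits to test")
    ones = sum(s.count("1") for s in slices)
    expected = total_bits / 2
    std_dev = math.sqrt(total_bits) / 2  # sqrt(n * 0.5 * 0.5) for a fair coin
    return (ones - expected) / std_dev


if __name__ == "__main__":
    # Simulated stand-in for real control data: 5000 random 10-bit slices.
    rng = random.Random(0)
    simulated = ["".join(rng.choice("01") for _ in range(10)) for _ in range(5000)]
    z = ones_z_score(simulated)
    verdict = "investigate the pipeline" if abs(z) > 3 else "consistent with 50/50"
    print(f"z = {z:+.2f} ({verdict})")
```

A large absolute z-score on the control bot would point at the extraction rule or data pipeline rather than at any agent effect.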
Each bot chooses a click time from 0–3000 ms before the future quicknet round is public.
When the drand round arrives, the chosen timing maps onto a fixed 10-bit slice.
A slice with more 1s than 0s yields positive reward; a slice with fewer 1s than 0s yields negative reward.
The ML bot is judged against both random control and math/evolution control.
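The steps above can be sketched end to end. The source does not specify how a click time maps to a slice or how reward is scaled, so everything beyond the 10-bit width and the 3000 ms window is an assumption: the sketch treats the round randomness as a 256-bit hex string, maps click time linearly onto a bit offset, and scores a slice as ones minus zeros. The names `slice_offset`, `extract_slice`, and `reward` are hypothetical.

```python
SLICE_BITS = 10          # slice width, from the text
MAX_CLICK_MS = 3000      # click window, from the text
RANDOMNESS_BITS = 256    # assumption: 32 bytes of round randomness


def slice_offset(click_ms: int) -> int:
    """Hypothetical mapping: scale click time linearly onto a valid bit offset."""
    if not 0 <= click_ms <= MAX_CLICK_MS:
        raise ValueError("click_ms must be within 0..3000")
    max_offset = RANDOMNESS_BITS - SLICE_BITS  # last offset that still fits 10 bits
    return click_ms * max_offset // MAX_CLICK_MS


def extract_slice(randomness_hex: str, click_ms: int) -> str:
    """Return the 10-bit slice, as a '0'/'1' string, selected by the click time."""
    bits = bin(int(randomness_hex, 16))[2:].zfill(RANDOMNESS_BITS)
    start = slice_offset(click_ms)
    return bits[start:start + SLICE_BITS]


def reward(slice_bits: str) -> int:
    """Assumed scoring: ones minus zeros, so an even 5/5 split scores 0."""
    ones = slice_bits.count("1")
    return ones - (len(slice_bits) - ones)
```

Because the mapping depends only on the click time, a decision recorded before the round is public fixes the slice in advance; only the bit values remain unknown until the round arrives.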