Interactive Physical Reasoning
ICLR 2024
The I-PHYRE benchmark contains 40 interactive physics games built mainly from gray blocks, black blocks, blue blocks, and red balls. All games share the same goal: get the red balls into the abyss by eliminating gray blocks, the only blocks that can be eliminated. Besides the static blocks (gray and black), the environment includes dynamic blue blocks that move under gravity, as well as springs and rigid sticks, which enrich the physical dynamics.
Abstract
Current evaluation protocols predominantly assess physical reasoning in stationary scenes, creating a gap in evaluating agents' abilities to interact with dynamic events. While contemporary methods allow agents to modify initial scene configurations and observe consequences, they lack the capability to interact with events in real time. To address this, we introduce I-PHYRE, a framework that challenges agents to simultaneously exhibit intuitive physical reasoning, multi-step planning, and in-situ intervention. Here, intuitive physical reasoning refers to a quick, approximate understanding of physics to address complex problems; multi-step denotes the need for extensive planning sequences in I-PHYRE, considering each intervention can significantly alter subsequent choices; and in-situ implies the necessity for timely object manipulation within a scene, where minor timing deviations can result in task failure. We formulate four game splits to scrutinize agents' learning and generalization of essential principles of interactive physical reasoning, fostering learning through interaction with representative scenarios. Our exploration involves three planning strategies and examines several supervised and reinforcement agents' zero-shot generalization proficiency on I-PHYRE. The outcomes highlight a notable gap between existing learning algorithms and human performance, emphasizing the imperative for more research in equipping agents with interactive physical reasoning capabilities.
Demo
Play
We recommend playing the games locally through a simple installation process to achieve real-time rendering and response.

Instructions:

  • Your only goal is to get all the red balls into the abyss by eliminating gray blocks. Click the center of a gray block to eliminate it.
  • Each game has a strict time limit of 15 seconds; if the limit is reached, the attempt fails.
  • Players are encouraged to find the fastest solution that uses the fewest eliminations. The rewards are designed as follows.
    Event                    Reward    Purpose
    Forwarding one second    -1        Encourage time efficiency
    Eliminating a block      -10       Encourage action efficiency
    Reaching the goal        +1000     Encourage problem solving
  • Refer to the introduction above for the properties of different bodies.
  • You have at most 5 attempts at each of the 40 games. A game counts as passed as soon as you succeed once.
  • Follow the steps below to start playing. Finishing all games takes about 40 minutes.
    Step 1

    Set up the environment

    Run the commands below in your terminal:

    conda create -n iphyre python=3.10

    conda activate iphyre

    pip install numpy pygame pymunk

    pip install iphyre

    Step 2

    Run the script

    Download the script and run the command below, replacing your_name with your own name:

    python collect_play_all.py your_name

    Step 3

    Save the result

    The rewards will be saved in {your_name}.json automatically.
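
The reward scheme described in the instructions can be sketched as a small function. This is a minimal illustration rather than the benchmark's own implementation: it assumes the penalties simply add up over one attempt and that time is counted in whole seconds up to the 15-second limit. The name `episode_reward` and its parameters are hypothetical and are not part of the iphyre package.

```python
# Reward values taken from the reward table in the instructions.
SECOND_PENALTY = -1        # forwarding one second
ELIMINATION_PENALTY = -10  # eliminating one gray block
SUCCESS_BONUS = 1000       # reaching the goal
TIME_LIMIT_S = 15          # strict time limit per attempt


def episode_reward(seconds_elapsed: int, blocks_eliminated: int, solved: bool) -> int:
    """Total reward for one attempt (hypothetical helper, additive assumption)."""
    if not 0 <= seconds_elapsed <= TIME_LIMIT_S:
        raise ValueError("seconds_elapsed must be between 0 and the time limit")
    if blocks_eliminated < 0:
        raise ValueError("blocks_eliminated must be non-negative")
    total = SECOND_PENALTY * seconds_elapsed
    total += ELIMINATION_PENALTY * blocks_eliminated
    if solved:
        total += SUCCESS_BONUS
    return total


if __name__ == "__main__":
    # Solved in 4 seconds with 2 eliminations.
    print(episode_reward(4, 2, True))
    # Ran out of time after 3 eliminations.
    print(episode_reward(15, 3, False))
```

Under these assumptions, a faster solution with fewer eliminations always scores higher than a slower one with more, and any successful attempt outscores any failed one because the success bonus outweighs the largest possible time penalty.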

    Basic Games
    Noisy Games
    Compositional Games
    Multi-ball Games