What if we set GPT-4 free in Minecraft?



@nitwit005 4d
I searched for the word "infinite", and my suspicion was quickly proven correct:

> 9) Do not write infinite loops or recursive functions.

> Sometimes GPT-4 will write an infinite loop that runs forever.

@Imnimo 4d
It's always fun to look at the prompts used by these projects. Here are a few snippets from this one:

>You are a helpful assistant that tells me the next immediate task to do in Minecraft. My ultimate goal is to discover as many diverse things as possible, accomplish as many diverse tasks as possible and become the best Minecraft player in the world.

>8) Tasks that require information beyond the player's status to verify should be avoided. For instance, "Placing 4 torches" and "Dig a 2x1x2 hole" are not ideal since they require visual confirmation from the screen. All the placing, building, planting, and trading tasks should be avoided. Do not propose task starting with these keywords

>7) Use `exploreUntil(bot, direction, maxDistance, callback)` when you cannot find something. You should frequently call this before mining blocks or killing mobs. You should select a direction at random every time instead of constantly using (1, 0, 1).

>9) Do not write infinite loops or recursive functions.

You can really imagine the sorts of pitfalls the agent fell into that induced the authors to add these stipulations.
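
To make two of those stipulations concrete, here is a minimal sketch in Python (not the project's actual code, which targets a JavaScript API). It assumes a direction is an `(x, y, z)` tuple with components in -1, 0, 1, as the `(1, 0, 1)` example suggests. `random_direction` and `find_with_retries` are hypothetical helper names, and the bounded `for` loop stands in for the "no infinite loops" rule.

```python
import random

# Assumption: a direction is an (x, y, z) tuple whose components are -1, 0, or 1,
# mirroring the (1, 0, 1) example quoted from the prompt.
def random_direction(rng=random):
    """Pick a random horizontal direction, never the zero vector."""
    choices = [(x, 0, z) for x in (-1, 0, 1) for z in (-1, 0, 1) if (x, z) != (0, 0)]
    return rng.choice(choices)

def find_with_retries(try_find, max_attempts=5, rng=random):
    """Call try_find(direction) at most max_attempts times.

    Returns the first non-None result, or None if every attempt fails.
    The loop is bounded, so it cannot run forever.
    """
    for _ in range(max_attempts):
        result = try_find(random_direction(rng))
        if result is not None:
            return result
    return None
```

Here `try_find` is a placeholder for whatever exploration call the bot would make; the point is only that each attempt gets a fresh direction and the search gives up after a fixed number of tries.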

@thesuperbigfrog 3d
At first glance this just seems like an alternative approach to building expert systems.

The Minecraft videos are impressive.

Nethack (https://www.nethack.org/) has been used for AI development in the past and more recently:

I am curious how well Voyager would do in Nethack.

@smcl 3d
That opening sentence is a very funny statement when you take into account that "...in Minecraft" is a way some YouTubers hide hyperbolic/unserious statements (to skirt TOS violations). Like after 100 deaths to Malenia in Elden Ring: "Oh fuck me, I might as well kill myself ... IN MINECRAFT"
@lsy 3d
It's not playing in the context of Minecraft, it's playing in the context of an API to Minecraft. You can see one of the limitations in its error condition when it tries to craft an "acacia axe" out of acacia planks and sticks, fails, and then replaces all the references with "wooden axe". Of course in the real world it doesn't matter what you call the axe you made, and it's pretty clear what an acacia axe is. Even if it did matter, you could also easily keep the function name and output message, and just make a "wooden axe" behind the scenes. The fact that the GPT is so tightly bound to the formalism of the API is an indication that this is a task the GPT can likely do quite well, as this API is well-used and documented.
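
A minimal sketch of that "behind the scenes" idea, in Python for illustration only. The alias table, the `craft` wrapper, and the `craft_fn` callback are all hypothetical and not part of the project or its API; the sketch just shows the requested name surviving in the output message while the API receives a name it accepts.

```python
# Hypothetical alias table: names the model might invent -> item names the API accepts.
ITEM_ALIASES = {"acacia_axe": "wooden_axe"}

def craft(requested_name, craft_fn):
    """Craft via craft_fn using the canonical name, but report the requested name."""
    # Fall back to the requested name when no alias is registered.
    canonical = ITEM_ALIASES.get(requested_name, requested_name)
    craft_fn(canonical)
    return f"Crafted {requested_name}"
```

Calling `craft("acacia_axe", some_api_call)` would pass `"wooden_axe"` to the API while the message still says "Crafted acacia_axe".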
@slg 3d
And yet it still digs straight down.
@tehsauce 3d
@codeulike 3d
So this is like it's writing a bot to play Minecraft?

I'd like to see a visual/language model/AI that learns to play Minecraft as an actual inhabitant of the game, i.e. processing visual input, recognising objects, working out what's going on, learning how to move around, learning how to make food and avoid monsters. It would be an 'Embodied AI' within the world of Minecraft.

The language part would allow us to talk to this being. You could ask it things like:

"Do you prefer to make a house, or dig a cave?"

"How do you feel, when you hear a monster outside at night?"

@jmugan 3d
I wish there was a summary of how this worked. I see the abstract and lots of figures and movies, but I still don't get a good sense of what exactly the algorithm is. I even skimmed the whole paper.
@Geee 3d
Very interesting. Here the LLM writes code that plays Minecraft. If software 2.0 is neural networks, then software 3.0 is code written by neural networks.
@blooalien 3d
Where I think it's gonna start to get really scary, and much more closely approach "real A.I." / AGI, is when they start augmenting and wiring together various differing forms of "A.I." with each other. GPT-4, no matter how impressive it might appear on the surface, is still "just" a large language model. Augment it with other types of learning models and at some point you might just hit on the right combination for it to start some form of actual "reasoning" thought or "creativity".

As long as they're all still "special" single-purpose systems (LLM is about processing and responding to language input for example, CV / Computer Vision models specialize in operating on visual or image inputs, etc.), that's all they'll ever be, no matter how good they get at pretending they're more.

@busseio 3d
I’d like to build something like this, but for Robot Odyssey.
@mensetmanusman 3d
It will set out to build a library that contains all the sources that it quotes but that only exist in the multiverse.
@sdenton4 3d
There's untold billions of tokens of Minecraft-related content on the web - the controlling LLM personally has memorized every Minecraft strategy guide ever written.
@thdespou 3d
Prompt engineering in its finest form.