Claude isn’t a great Pokémon player, and that’s okay

April 10, 2025

If Claude Plays Pokémon is supposed to offer a glimpse of the future of artificial intelligence, it is not a very convincing one. For the past month and counting, Claude has been trying to beat Pokémon Red. Across several runs, it has failed to finish the roughly 30-year-old game. And yet for David Hershey, the project’s lead developer, the showing has been a success.

“I wanted a place where I could understand how Claude handles situations in which it needs to work over a very long period,” Hershey tells me over a video call. As part of his day job at Anthropic, Hershey works on the go-to-market team, where he helps the company’s customers build their own agents (more on that in a moment). He first began work on Claude Plays Pokémon as a side project shortly after the release of Claude 3.5 Sonnet last June.

As you can probably guess from the name, the project was partly inspired by Twitch Plays Pokémon, which debuted in 2014 and saw 1.16 million participants attempt to beat Pokémon Red using only inputs that viewers typed into the stream’s chat box. Hershey was not the first Anthropic employee to try to turn Claude into a Pokémon League Champion, but the project took on a life of its own once he got involved.

In the early days of the project, it was a big deal when Claude managed to leave Red’s house and find Professor Oak. “I spent an ungodly number of hours tinkering to get it to make that kind of progress,” says Hershey. He would update his colleagues on Claude’s progress in an internal Slack channel. At that point, most of the company was not paying attention, and there was no plan to share the project with the world.

However, Hershey made a habit of revisiting the project with every release of a major new Anthropic model: first with the updated Claude 3.5 Sonnet last fall, and again recently with 3.7 Sonnet. “It’s the way I go to see, ‘What is this new model? How does it do? What can I learn about it?’”

Inside Anthropic, the hope was that Claude would get better at trying different strategies and adjusting its approach when things did not go according to plan. With Pokémon Red, the company got to see Claude do those things in real time. “[Claude 3.7 Sonnet] spends less time stuck in assumptions,” says Hershey. “You will still see it make a guess, then spend a number of hours believing that guess is true and making dumb decisions in the meantime, but previous models would just do that forever.”


You can literally watch Claude develop and run with those assumptions. Each slow step in the game is preceded by a paragraph of text from the AI (“I encountered a wild Pokémon while moving to (24,24). According to my strategy, I should run from this battle to conserve resources”) and followed by a single button press. Then it re-evaluates the state of the game and does so again.
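The cycle described above (write a paragraph of reasoning, press one button, re-read the game state) can be sketched as a simple loop. This is only an illustration under stated assumptions, not Anthropic's code: `GameState`, `Step`, `toy_policy`, `toy_emulator` and `run_loop` are hypothetical names, and the toy policy stands in for the model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GameState:
    position: Tuple[int, int]  # hypothetical (x, y) tile coordinates
    in_battle: bool


@dataclass
class Step:
    reasoning: str  # the paragraph of text that precedes each input
    button: str     # exactly one button press per step


def toy_policy(state: GameState) -> Step:
    """Hypothetical stand-in for the model: reason, then pick one button."""
    if state.in_battle:
        return Step("Wild battle; running to conserve resources.", "RUN")
    return Step(f"At {state.position}; continuing north.", "UP")


def toy_emulator(state: GameState, button: str) -> GameState:
    """Hypothetical stand-in for the emulator: apply one button press."""
    if button == "RUN":
        return GameState(state.position, in_battle=False)
    if button == "UP":
        x, y = state.position
        return GameState((x, y - 1), in_battle=False)
    return state


def run_loop(
    state: GameState,
    policy: Callable[[GameState], Step],
    emulator: Callable[[GameState, str], GameState],
    max_steps: int,
) -> Tuple[List[Step], GameState]:
    """Observe, reason, press one button, then re-evaluate; repeat."""
    log: List[Step] = []
    for _ in range(max_steps):
        step = policy(state)                  # reasoning about current state
        log.append(step)
        state = emulator(state, step.button)  # single input, new state
    return log, state


log, final_state = run_loop(GameState((24, 24), True), toy_policy, toy_emulator, 3)
for step in log:
    print(f"{step.reasoning} -> {step.button}")
```

Because the state is only re-read after each input, a wrong belief can persist across many iterations until an observation contradicts it, which is the failure mode Hershey describes.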

If you are watching Claude Plays Pokémon as a fan of the game, a model being “less stuck in assumptions” can seem like a slight improvement, especially when the chatbot frequently gets stuck in areas such as Viridian Forest, sometimes for days at a time, because of the level design. Even so, it is a notable sign of the type of artificial intelligence system Claude 3.7 represents.

Like many modern frontier AI systems, Claude 3.7 Sonnet is a reasoning model, which means it is designed to tackle problems by breaking them into smaller pieces. “Many of our customers are interested in Claude’s agentic capabilities,” Hershey explains. For the uninitiated, AI agents, or agentic AI, are systems designed to plan and carry out complex tasks without human supervision. Right now, most people think of artificial intelligence as an empty chat box waiting for a question, but chatbots are only the consumer face of the industry; agents are an incremental but important step toward the promise of artificial general intelligence.

From that perspective, a few things make Claude Plays Pokémon interesting. First, there is a surprising fact: Hershey delegated much of the programming that made the project possible to Anthropic’s coding agent, including an overlay that allows Claude to understand Pokémon Red’s game world.

Second, and more importantly, Claude was not pre-trained to play Pokémon Red. The chatbot knows some of the basics about the game, such as the name of each gym leader and the order in which the player must beat them, but it does not have hundreds of years of game knowledge like some specialized artificial intelligence systems. “You can throw a model into a game with no preparation and no instructions, and it can learn everything itself,” he says. “I aim to be as close to that as possible.”

Hershey did have to give Claude some help. I have already mentioned the overlay that allows it to interpret Pokémon Red’s interface. Pixel art is something all artificial intelligence systems struggle with, and 3.7 Sonnet is no exception. As human beings, our imagination does a great job of filling in the details that a few pixels suggest. What’s more, Claude does not “see” the way we do.

If you watch closely, you will notice that every time it moves the player character, it makes a few inputs before re-evaluating its position. Between those frames, Claude has no sensory input. It cannot see Red walking, nor can it “hear” when its inputs cause him to bump into a tree or another obstacle. Claude’s “poor vision” is one of the main reasons it struggles with the game; in fact, Hershey had to give the chatbot a way to read the game’s memory, because otherwise it was likely to get stuck whenever it misinterpreted the screen.
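To illustrate why reading memory helps, here is a minimal sketch of detecting a blocked move without any visual or audio cue. It assumes, hypothetically, that the player's tile coordinates can be read from game memory before and after an input; `MOVE_BUTTONS` and `is_blocked` are illustrative names, not part of the actual project.

```python
from typing import Tuple

Position = Tuple[int, int]

# Hypothetical set of inputs that are expected to change the player's tile.
MOVE_BUTTONS = {"UP", "DOWN", "LEFT", "RIGHT"}


def is_blocked(before: Position, after: Position, button: str) -> bool:
    """Return True when a movement input left the player on the same tile.

    `before` and `after` are assumed to come from reading the game's memory,
    so the check does not depend on interpreting the pixel-art screen.
    """
    return button in MOVE_BUTTONS and before == after


print(is_blocked((5, 5), (5, 5), "UP"))  # same tile after a move input
print(is_blocked((5, 5), (5, 4), "UP"))  # the move succeeded
```

A check like this stands in for the bump sound a human player would hear when walking into a tree.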

If the project’s goal were simply for Claude to beat Pokémon Red, that would have been easy. Hershey could have programmed a walkthrough of the game for the chatbot to follow, but at that point, all he would have been testing was how good Claude is at following a rigid set of instructions. “Claude is very good at that,” says Hershey. “I knew that. We all knew that.”

Instead, left to its own devices, the new model has shown that it is better at planning, coming up with new strategies and eventually trying something different when its assumptions are wrong. One of the more novel solutions Claude came up with during its third run through the game was to deliberately make its Pokémon faint so that it could escape Mt. Moon.

However, Claude could be much better at both short- and long-term planning. In the same example I just mentioned, Claude deleted all its notes on Mt. Moon after waking up at the nearby Pokémon Center, incorrectly believing it had succeeded in navigating the cave. One of its most promising runs ended after Claude failed to recognize that it needed to speak to Bill to advance in the game. It got stuck in an endless loop of bad decisions.

“Moving forward, I don’t know how useful it will be internally as a benchmark. It is possible that with a small set of skills, Claude gets a little better and the game is beaten, and then the benchmark is not interesting,” Hershey admits. “It may be that there are things I do not fully understand about what will make our next model good, and then we will continue to learn a lot of additional things along the way.”

As for what happens next, Hershey says he does not have a long-term strategy for Claude Plays Pokémon. “I have spent a lot of time, my wife would say too much time, staring at this thing,” he says, laughing. I also get the feeling Hershey is not quite ready to close the book on the project. “I would like to imagine that whenever a new model comes out, I will play Pokémon with it, and I will show the world that too.”

Until then, Anthropic continues, after a recent reset, to stream Claude Plays Pokémon on Twitch. The project has been successful enough to inspire an independent developer to program a Gemini Plays Pokémon stream, and if I had to guess, we will see more imitators before long.

This article originally appeared on Engadget at https://www.engadget.com/ai/claude-isnt-a- Great-Pokemon-player-and-that-okay-151522448.html? SRC = RSS?
