AI’s Three Questions
Before the 2025 World Artificial Intelligence Conference (WAIC 2025) opened, three core questions were posed: the mathematical question, the scientific question, and the model question. These questions probe the essence of AI development and how the new wave of the AI revolution will shape the evolution of human civilization.
However, during the exhibition, these questions were simplified into a more straightforward inquiry: Can large models generate a better world?
Understanding Data
Large models are evolving far faster than humans. Xu Li, Chairman and CEO of SenseTime, recalls that in 2012, when AI pioneer Geoffrey Hinton's team won the ImageNet competition, the scale of machine learning was roughly equivalent to transferring ten years of one person's knowledge to AI. In the generative-AI era, the 750 billion tokens ChatGPT has processed are equivalent to a natural-language writer producing text for 100,000 years.
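For scale, the comparison can be unpacked with simple arithmetic (the per-year and per-day figures below are derived here, not stated in the source):

```python
# Arithmetic behind the comparison: 750 billion tokens spread over
# 100,000 years of writing implies the following sustained output.
total_tokens = 750_000_000_000
years = 100_000

per_year = total_tokens / years        # tokens per year
per_day = per_year / 365               # tokens per day

print(f"{per_year:,.0f} tokens per year")   # 7,500,000 tokens per year
print(f"{per_day:,.0f} tokens per day")     # about 20,548 tokens per day
```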
For models evolving at this pace, "data hunger" is a pressing problem. Large models have nearly exhausted all publicly available data; by some estimates, the natural-language data on the internet may run out around 2027-2028. Humans simply do not produce new language as fast as computational power grows, creating what has been called a "reverse gap" for large models.
How can the industry keep supplying the "oil" of data to support development? Several paths have been proposed, and interestingly, all require the participation of large models. One is to mine hidden "oil" by tapping private-sector data in the real world. "We find that traditional enterprises are eager to embrace large models, but their data assets have not been structured," said Tan Bin, CMO of StarRing Technology. He likens large models to supercars and enterprises' raw data to untapped oil fields; the urgent task is refining those fields into high-octane gasoline.
Another, bolder idea is to have large models generate data themselves. The prerequisite is that such data be grounded in an understanding of the real world; otherwise it easily produces hallucinations and a distorted perception of the world. SenseTime recently launched its "Enlightenment" world model, aimed primarily at intelligent driving. Xu Li illustrated it with the "cut-in" scenario: collecting enough real-world cut-in data would be a time-consuming project, but the world model can now generate cut-in videos from seven camera angles, varying details such as weather, vehicle type, road structure, and speed to cover many possible situations.
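The value of varying those details is combinatorial: even a handful of options per parameter multiplies into many distinct scenarios. The sketch below is purely illustrative (the parameter names and values are hypothetical, not SenseTime's); each combination would seed one generated video.

```python
# Illustrative only: enumerating scenario parameters for synthetic
# cut-in data. Not SenseTime's API; values are made up for the sketch.
from itertools import product

weathers = ["clear", "rain", "fog"]
vehicles = ["sedan", "truck", "motorcycle"]
roads = ["highway", "urban street", "on-ramp"]
speeds_kph = [40, 80, 120]

# One dict per unique combination of parameters.
scenarios = [
    {"weather": w, "cut_in_vehicle": v, "road": r, "ego_speed_kph": s}
    for w, v, r, s in product(weathers, vehicles, roads, speeds_kph)
]
print(len(scenarios))  # 3 * 3 * 3 * 3 = 81 distinct scenarios
```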
"As model capabilities grow and our understanding of the world deepens, the unity of understanding and generation opens up more interactive possibilities," Xu Li believes. This approach to the data problem is shifting from passive collection to proactive generation; it will help many industries advance and offer new ways to explore the real world.
Exploring Efficiency
What is the greatest help large models give you? The answers are remarkably consistent: improved efficiency. Over the past few years, the rapid efficiency gains brought by large models have been exhilarating.
During this year's exhibition there were numerous striking cases. At the Baidu booth, a reporter tried an application called "Miao Da," whose slogan is "One Sentence to Create an Application." The reporter entered a natural-language command: "Please help me design a professional website showcasing Shanghai's tourist attractions." Within seconds, the large model parsed the command into four components: attraction search, key sections, design requirements, and functional needs. It then automatically assembled a virtual development team with different skills. The reporter watched the webpage take shape without seeing a single line of code, and within three minutes the site was complete. "Miao Da" even gave it a catchy name, "Exploring the Magic City of Shanghai." The page covered all the classic attractions and came with features such as a message board and a booking interface.
An engineer demonstrating the product told the reporter that building such a website used to take about 40 days and a team of architects, operators, product managers, backend developers, and testers; the new efficiency is astonishing. Zhu Guangxiang, head of Baidu's "Miao Da" product, explained that behind it is a combination of multi-agent collaboration and multi-tool invocation built on the Wenxin large model, which lets the product mobilize domain-expert agents according to the user's command.
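The "multi-agent collaboration + multi-tool invocation" pattern can be sketched in a few lines. This is a minimal, hypothetical illustration of the general idea, not Baidu's implementation; the agent roles and the `plan`/`run_agent` helpers are invented for the sketch, and a real system would delegate both planning and execution to an LLM with tool access.

```python
# Hypothetical sketch of multi-agent orchestration (NOT "Miao Da"):
# a planner decomposes a natural-language command into subtasks,
# and role-specific agents each handle one subtask.

def plan(command):
    """Decompose a command into (role, subtask) pairs.
    A real system would ask an LLM to plan; here it is hard-coded."""
    return [
        ("product_manager", f"define key sections for: {command}"),
        ("designer", f"specify design requirements for: {command}"),
        ("developer", f"assemble pages and features for: {command}"),
        ("tester", f"verify the finished site for: {command}"),
    ]

def run_agent(role, subtask):
    """Stand-in for invoking a domain-expert agent with its tools."""
    return f"[{role}] completed: {subtask}"

def orchestrate(command):
    """Run every planned subtask and collect the results."""
    return [run_agent(role, task) for role, task in plan(command)]

results = orchestrate("a website showcasing Shanghai's attractions")
for line in results:
    print(line)
```

The design choice worth noting is the separation of planning from execution: the planner only names roles and subtasks, so agents can be swapped or parallelized without changing the decomposition logic.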
Such cases were numerous at the World Artificial Intelligence Conference. Tongyi Qianwen, for instance, showcased the AI programming model Qwen3-Coder, which excels at coding and intelligent-agent invocation; with its help, a novice programmer can finish a week's worth of work in a day. Dewu's AI authentication system, winner of this year's top honor, the SAIL Award, has already been deployed in the industry and can generate an authentication report for a pair of sneakers in just five seconds.
The Safety Question
When discussing whether large models can generate a better world, safety is an unavoidable topic. Every year, the World Artificial Intelligence Conference treats AI safety governance as one of its highest-level issues, because it concerns the future of humanity.
Just before the conference opened, an international dialogue consensus on AI safety was released, signed by more than 20 industry experts and scholars, including Geoffrey Hinton and Yao Qizhi, calling for greater global investment in AI safety. The signatories broadly believe humanity is at a critical turning point: AI systems are rapidly approaching, and may soon surpass, human levels of intelligence.
Comprehensive AI safety education is an urgent task. At the conference, the reporter tried a "face-swapping" demo: standing before a screen, the system first recorded the reporter's real face; a few seconds after clicking generate, a "digital mask" closely resembling the real person appeared, replicating facial expressions and movements such as head shaking and blinking. Under analysis by Hehe Information's AI face-verification model, however, these "fake faces" were accurately identified.
"The 'fake faces' generated by our interactive exhibit can be produced by today's mainstream general-purpose large models," a member of Hehe Information's technology team said, stressing that AI safety must not be underestimated: such forgeries could threaten scenarios like bank identity verification, remote account opening, and the validation of large transactions. "We hope to remind people to pay attention to AI safety, so that large models 'generate' a better world rather than pose threats."
To address challenges such as agent overreach and excessive delegation, Tsinghua University and Ant Group jointly upgraded the large-model safety solution "Ant Tianjian." Built on a "defense through offense" security philosophy, it constructs a full-process protection system through an "alignment-scanning-defense" technology stack. The solution will reportedly be co-developed with the industry and gradually open-sourced to help establish a trustworthy AI ecosystem.