
Brand Stories

Artificial intelligence for healthcare: restrained development despite impressive applications | Infectious Diseases of Poverty



Artificial intelligence (AI) has avoided the headlines until now, yet it has been with us for 75 years [1, 2]. Still, few understand what it really is and many feel uncomfortable about its rapid growth, with thoughts going back to the computer rebelling against the human crew onboard the spaceship heading out into the infinity of space in Arthur C. Clarke’s visionary novel “2001: A Space Odyssey” [3]. Just as in the novel, there is no way back, since the human mind cannot continuously operate at an unwavering level of accuracy or simultaneously interact with different sections of large-scale information (Big Data), areas where AI excels. The World Economic Forum has called for faster adoption of AI in the field of healthcare, a point discussed at length in a very recent white-paper report [4] arguing that progress is not forthcoming as fast as expected despite the evident potential for growth and innovation being at an all-time high and strong demand for new types of computer processors. Among the reasons mentioned for the slow uptake in areas dealing with healthcare are barriers such as complexity deterring policymakers and the risk of misaligned technical and strategic decisions due to fragmented regulations [4].

The growing importance of AI in the medical and veterinary fields is strengthened by recent articles and editorials published in The Lancet Digital Health and The Lancet [5, 6] underlining the actual and potential roles of AI in healthcare. We survey this wide spectrum, highlighting current gaps in the understanding of AI and how its application can assist clinical work as well as support and accelerate basic research.

AI technology development

From rules to autonomy

Before elaborating on these issues, some basic background on the technology that has moved AI to the fore is in order. In 1968, when both the film and the novel were released, only stationary, primitive computers existed. Rather than remaining the preserve of large companies and academic institutions, they morphed into today’s public laptops, smartphones and wearable sensor networks. The next turn came with the gaming industry’s insatiable need for ultra-rapid action and life-like characters necessitating massively parallel computing, which led to switching from general-purpose central processing units (CPUs) to specialized graphics processing units (GPUs) and tensor processing units (TPUs). Fuelled by this expansion of the processor architecture, neural networks, machine learning and elaborate algorithms capable of changing in conjunction with new data (meta-learning) were ushered in, with the rise of the power to understand and respond to human language through generative pre-trained transformers (GPT) [7] showing the way forward. Breaking out of rule-based computing through the emergent capability of modifying internal settings, adapting to new information and understanding changing environments put these flexible systems, now referred to as AI, in the fast lane towards domains requiring high-level functionality. Computer systems adapted to a wide range of tasks for which they were not explicitly programmed could then be developed and launched into the public arena, as exemplified by automated industrial production, self-driving vehicles, virtual assistants and chatbots. Although lacking the imagination and versatility that characterize the human mind, AI can indeed perform tasks partly based on reasoning and planning that typically require human cognitive functions, and with enhanced efficiency and productivity.

Agent-based AI

Here, the agent is any entity that can perceive its environment, make decisions and act toward some goal, where rule-based AI has been replaced with proactive interaction. Agent-based AI generally uses many agents working separately to solve joint problems or even collaborating like a team. This approach was popularized by Wooldridge and Jennings in the 1990s, who described decentralized, autonomous AI systems capable of ‘meta-learning’ [8]. They felt that outside targets can be instantiated and dealt with as computational objects, a methodology that has advanced the study of polarization, traffic flow, spread of disease, and similar phenomena. Although technology did not catch up with this vision until much later, AI today encompasses a vital area of active research producing powerful tools for simulating complex distributed and adaptive systems. The great potential of this approach for disease distributions and transmission dynamics may provide the insights needed to successfully control the neglected tropical diseases (NTDs) as well as to deal with other challenges in the geospatial health sphere [9]. The Internet of Things (IoT) [10], another example of agent-based AI, represents the convergence of embedded sensors and software enabling the collection and exchange of data with other devices and systems; however, operations are often local and do not necessarily involve the Internet.

While the rule-based method follows a set of rules and therefore produces an outcome which is to some degree predictable, the two new components in the agent-based approach are the capability of learning from experience and the testing of various outcomes by one or several models. This introduces a level of reasoning, which allows for non-human choice, as schematically shown in Fig. 1.

Fig. 1

The research schemes of the two AI approaches, rule-based AI and agent-based AI (AI refers to artificial intelligence)
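The contrast between the two approaches can be illustrated with a toy sketch. It is not taken from the article or from any system it cites: the outbreak-alert scenario, the names `rule_based_alert` and `LearningAgent`, and the threshold values are all hypothetical, chosen only to show a fixed rule next to a decision-maker that adapts from feedback.

```python
from dataclasses import dataclass


def rule_based_alert(case_count: int, threshold: int = 10) -> bool:
    """Fixed rule: alert whenever the count exceeds a hand-written threshold."""
    return case_count > threshold


@dataclass
class LearningAgent:
    """Toy agent (hypothetical): perceives a count, decides, then adapts."""
    threshold: float = 10.0
    step: float = 1.0

    def decide(self, case_count: int) -> bool:
        # Same decision form as the rule, but the threshold is not fixed.
        return case_count > self.threshold

    def learn(self, case_count: int, was_outbreak: bool) -> None:
        # Compare the decision with what actually happened and adjust.
        decision = self.decide(case_count)
        if decision and not was_outbreak:
            self.threshold += self.step  # false alarm: become less sensitive
        elif not decision and was_outbreak:
            self.threshold -= self.step  # missed outbreak: become more sensitive


agent = LearningAgent()
# Made-up feedback: (observed case count, whether an outbreak followed).
history = [(8, True), (9, True), (8, True)]
for count, outbreak in history:
    agent.learn(count, outbreak)

print(rule_based_alert(8))  # the fixed rule still ignores a count of 8
print(agent.decide(8))      # the agent has lowered its threshold and alerts
```

The rule gives the same answer for the same input every time, whereas the agent's answer depends on the experience it has accumulated, which is the "learning from experience" component described above in its simplest possible form.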

AI applications

Clinical applications

Contrary to common belief, a diagnostic program that today would be sorted under the heading AI was designed as early as 50 years ago at Stanford University, California, United States of America. The system, called MYCIN [11], was intended to assist physicians with regard to bacterial blood infections. It was originally produced in book format, utilized a knowledge base of approximately 600 rules and operated through a series of questions to the user, ultimately providing a diagnosis and treatment recommendation. In the United States, similar approaches aimed at the diagnosis of bacterial infections appeared in the following decades but were not often used due to lack of computational power at the time. Today, on the other hand, this is no longer the limiting factor and AI is revolutionizing image-based diagnostics. In addition to the extensive use of AI-powered microscopy in parasitology, the spectrum includes microscopic differentiation between healthy and cancerous tissue in microscope sections [12], as well as interpretations of graphs and videos from electrocardiography (EKG) [13], computed tomography (CT) [14, 15], magnetic resonance imaging (MRI) [15] and ultrasonography [16].

Some AI-based companies are doing well, e.g., ACL Digital (https://www.acldigital.com/), which analyzes data from wearable sensors detecting heart arrhythmias, hypertension and sleep disorders; AIdoc (https://www.aidoc.com/eu/), whose platform evaluates clinical examinations and coordinates workflows beyond diagnosis; and the da Vinci Surgical System (https://en.wikipedia.org/wiki/Da_Vinci_Surgical_System), which has been used for various interventions, including kidney surgery and hysterectomy [17, 18]. However, others have failed, e.g., ‘Watson for Oncology’, launched by IBM for cancer diagnosis and optimized chemotherapy (https://www.henricodolfing.com/2024/12/case-study-ibm-watson-for-oncology-failure.html), and Babylon Health (https://en.wikipedia.org/wiki/Babylon_Health), a tele-health service that connected people to doctors via video calls and offered wholesale health promotion with high precision and virtual health assistants (chatbots) that even reminded patients to take medication. These final examples of AI-assisted medicine show that strong regulation is needed before this kind of assistance can be released for public use.

Basic research

The 2024 Nobel Prizes granted AI a central role: while the Physics Prize was awarded for the development of associative neural networks, the Chemistry Prize honored the breakthrough findings regarding how strings of amino acids fold into particular shapes [19]. This thorny problem was cracked by AlphaFold2, a deep-learning system developed at DeepMind, a company that now belongs to Google’s parent Alphabet Inc. The finding that all proteins share the same folding process widened the research scope, making it possible to design novel proteins with specific functions (synthetic biology), accelerate drug discovery and shed light on how diseases arise through mutations. The team that created this system has its current sights set on finding out how proteins interact with the rest of the cellular machinery. AlphaFold3, an updated version of the architecture, generates accurate, three-dimensional molecular structures by pair-wise interaction between molecular components, which can be used to model how specific proteins work in union with other cell components, exposing the details of protein interaction. These new applications highlight the exponential rise of AI’s significance for research in general and for medicine in particular.

The solution to the protein-folding problem not only reflects the importance of the training component but also demonstrates that AI is not as restricted as the human mind is when it comes to large realms of information (Big Data), which is needed for a large number of activities in modern society, such as autonomous driving and the large-scale financial transactions handled by banks on a daily basis. Big Data is also common in healthcare, arising not only in hospital management and patient records but also in large-scale diagnostic approaches. An academic paper, co-authored by clinicians and Google Research, investigated the reliability of a diagnostic AI system, finding that machine learning reduced the number of false positives in a large mammography dataset by 25% (and also reached conclusions considerably faster) compared with the standard clinical workflow, without missing any true positives [20], a reassuring result.

Epidemiological surveillance

AI tools have been widely applied in epidemiological surveillance of vector-borne diseases. Due to vectors’ sensitivity to temperature and precipitation, the arthropod vectors are bellwether indicators, not only for the diseases they often carry but also for climate change. By gaining deeper insights into the complex interactions between climate, ecosystems and parasitic diseases with intricate life cycles, AI technologies assist by handling Big Data and even using reasoning to deal with obscure variations and interactions of climate and biological variables. To keep abreast of this situation, the connections between human, animal and environmental health not only demand data-sharing at the local level but also nationally and globally. This move towards the One Health/Planetary Health approach is highly desirable, and AI will unquestionably be needed for sustaining diligence with respect to the Big Data repositories required for accurate predictions of disease transmission, while AI-driven platforms can further facilitate real-time information exchange between stakeholders, optimize energy consumption and improve resource management for infections in animals and humans, in particular with regard to parasitic infections [21]. Proactive synergies between public health and other disciplines, such as ecology, genomics, proteomics, bioinformatics, sanitary engineering and socio-economy make the future medical agenda not only exciting and challenging, but also highly relevant globally.

In epidemiology, there has been a strong advance across the fields of medical and veterinary sciences [22], while previously overlooked events and unusual patterns now stand a better chance of being picked up by AI analysis of indirect methods, e.g., phone tracing, social media posts, news articles and health records. Technically less complex, but no less innovative operations are required to update the roadmap for elimination of the NTDs issued by the World Health Organization (WHO) [23]. The Expanded Special Project for the Elimination of Neglected Tropical Diseases (ESPEN) is a collaborative effort between the WHO regional office for Africa, member states and NTD partners. Its portal [24] offers visualization and planning tools based on satellite-generated imagery, climate data and historical disease patterns that are likely to identify high-risk areas for targeted interventions and allocate resources effectively. In this way, WHO’s roadmap for NTD elimination is becoming more data-driven, precise and scalable, thereby accelerating progress.

The publication records

Established as far back as 1993, Artificial Intelligence Research was the first journal specifically focused on AI, soon followed by an avalanche of similar ones (https://www.scimagojr.com/journalrank.php?category=1702). China, India and the United States are particularly active in AI-related research. According to the Artificial Intelligence Index Report 2024 [25], the total number of general AI publications had risen from approximately 88,000 in 2010 to more than 240,000 in 2022, with publications on machine learning increasing nearly sevenfold since 2015. If conference papers and repository publications (such as arXiv) are also included, along with papers in both English and Chinese, the number rises to 900,000, with the great majority originating in China [26].

A literature search based solely on PubMed, which we carried out at the end of 2024 using “AI and infectious disease(s)” as the search term, resulted in close to 100,000 entries, while the term “Advanced AI and infectious disease(s)” resulted in only about 6600. The idea was to find the distinction between simpler, more rule-based applications and proper AI. Naturally, results of this kind can be grossly misleading, as information on the exact type of computer processor used, be it CPU, GPU or TPU, is generally absent and can only be inferred. Nevertheless, the much lower figure for “Advanced AI and infectious disease(s)” is an indication of the preference for less complex AI applications so far, i.e. work including spatial statistics and comparisons between various sets of variables vis-à-vis diseases, aiming at estimating distributions, hotspots, vector breeding sites, etc.

With as many as 100,000 medical publications found in the PubMed search, they clearly dominate in relation to the total of more than 240,000 AI-assisted research papers found up to 2022 [25]. The growing importance of this field is further strengthened by recent articles and editorials [27, 6]. Part of this interest is probably due to the wide spectrum of the medical and veterinary fields and AI’s potential in tracing and signalling disease outbreaks plus its growing role in surveillance that has led to a surge of publications on machine learning, offering innovative solutions to some of the most pressing challenges facing health research today [28].




‘You can make really good stuff – fast’: new AI tools a gamechanger for film-makers | Artificial intelligence (AI)



A US stealth bomber flies across a darkening sky towards Iran. Meanwhile, in Tehran a solitary woman feeds stray cats amid rubble from recent Israeli airstrikes.

To the uninitiated viewer, this could be a cinematic retelling of a geopolitical crisis that unfolded barely weeks ago – hastily shot on location, somewhere in the Middle East.

However, despite its polished production look, it wasn’t shot anywhere, there is no location, and the woman feeding stray cats is no actor – she doesn’t exist.

Midnight Drop, an AI film depicting US-Israeli bombings in Iran

The engrossing footage is the “rough cut” of a 12-minute short film about last month’s US attack on Iranian nuclear sites, made by the directors Samir Mallal and Bouha Kazmi. It is also made entirely by artificial intelligence.

The clip is based on a detail the film-makers read in news coverage of the US bombings – a woman who walked the empty streets of Tehran feeding stray cats. Armed with the information, they have been able to make a sequence that looks as if it could have been created by a Hollywood director.

The impressive speed and, for some, worrying ease with which films of this kind can be made has not been lost on broadcasting experts.

Last week Richard Osman, the TV producer and bestselling author, said that an era of entertainment industry history had ended and a new one had begun – all because Google has released a new AI video making tool used by Mallal and others.

A still from Midnight Drop, showing the woman who feeds stray cats in Tehran in the dead of night. Photograph: Oneday Studios

“So I saw this thing and I thought, ‘well, OK that’s the end of one part of entertainment history and the beginning of another’,” he said on The Rest is Entertainment podcast.

Osman added: “TikTok, ads, trailers – anything like that – I will say will be majority AI-assisted by 2027.”

For Mallal, an award-winning London-based documentary maker who has made adverts for Samsung and Coca-Cola, AI has provided him with a new format – “cinematic news”.

The Tehran film, called Midnight Drop, is a follow-up to Spiders in the Sky, a recreation of a Ukrainian drone attack on Russian bombers in June.

Within two weeks, Mallal, who directed Spiders in the Sky on his own, was able to make a film about the Ukraine attack that would have cost millions – and would have taken at least two years including development – to make pre-AI.

“Using AI, it should be possible to make things that we’ve never seen before,” he said. “We’ve never seen a cinematic news piece before turned around in two weeks. We’ve never seen a thriller based on the news made in two weeks.”

Spiders in the Sky was largely made with Veo3, an AI video generation model developed by Google, and other AI tools. The voiceover, script and music were not created by AI, although ChatGPT helped Mallal edit a lengthy interview with a drone operator that formed the film’s narrative spine.

Film-maker recreates Ukrainian drone attack on Russia using AI in Spiders in the Sky

Google’s film-making tool, Flow, is powered by Veo3. It also creates speech, sound effects and background noise. Since its release in May, the impact of the tool on YouTube – also owned by Google – and social media in general has been marked. As Marina Hyde, Osman’s podcast partner, said last week: “The proliferation is extraordinary.”

Quite a lot of it is “slop” – the term for AI-generated nonsense – although the Olympic diving dogs have a compelling quality.

Mallal and Kazmi aim to complete the film, which will intercut the Iranian’s story with the stealth bomber mission and will be six times the length of Spider’s two minutes, in August. It is being made by a mix of models including Veo3, OpenAI’s Sora and Midjourney.

“I’m trying to prove a point,” says Mallal. “Which is that you can make really good stuff at a high level – but fast, at the speed of culture. Hollywood, especially, moves incredibly slowly.”


Spiders in the Sky, an AI film directed by Samir Mallal, tells the story of Ukraine’s drone attacks on Russian airfields. Photograph: Oneday Studios

He adds: “The creative process is all about making bad stuff to get to the good stuff. We have the best bad ideas faster. But the process is accelerated with AI.”

Mallal and Kazmi also recently made Atlas, Interrupted, a short film about the 3I/Atlas comet, another recent news event, that has appeared on the BBC.

David Jones, the chief executive of Brandtech Group, an advertising startup using generative AI – the term for tools such as chatbots and video generators – to create marketing campaigns, says the advertising world is about to undergo a revolution due to models such as Veo3.

“Today, less than 1% of all brand content is created using gen AI. It will be 100% that is fully or partly created using gen AI,” he says.

Netflix also revealed last week that it used AI in one of its TV shows for the first time.

A Ukrainian drone homes in on its target in Spiders in the Sky. Photograph: Oneday Studios

However, in the background of this latest surge in AI-spurred creativity lies the issue of copyright. In the UK, the creative industries are furious about government proposals to let models be trained on copyright-protected work without seeking the owner’s permission – unless the owner opts out of the process.

Mallal says he wants to see a “broadly accessible and easy-to-use programme where artists are compensated for their work”.

Beeban Kidron, a cross-bench peer and leading campaigner against the government proposals, says AI film-making tools are “fantastic” but “at what point are they going to realise that these tools are literally built on the work of creators?” She adds: “Creators need equity in the new system or we lose something precious.”

YouTube says its terms and conditions allow Google to use creators’ work for making AI models – and denies that all of YouTube’s inventory has been used to train its models.

Mallal calls his use of AI to make films “prompt craft”, a phrase that uses the term for giving instructions to AI systems. When making the Ukraine film, he says he was amazed at how quickly a camera angle or lighting tone could be adjusted with a few taps on a keyboard.

“I’m deep into AI. I’ve learned how to prompt engineer. I’ve learned how to translate my skills as a director into prompting. But I’ve never produced anything creative from that. Then Veo3 comes out, and I said, ‘OK, finally, we’re here.’”




AI’s next leap demands a computing revolution



We stand at a technological crossroads remarkably similar to the early 2000s, when the internet’s explosive growth outpaced existing infrastructure capabilities. Just as dial-up connections couldn’t support the emerging digital economy, today’s classical computing systems are hitting fundamental limits that will constrain AI’s continued evolution. The solution lies in quantum computing – and the next five to six years will determine whether we successfully navigate this crucial transition.

The computational ceiling blocking AI advancement

Current AI systems face insurmountable mathematical barriers that mirror the bandwidth bottlenecks of early internet infrastructure. Training large language models like GPT-3 consumes 1,300 megawatt-hours of electricity, while classical optimization problems require exponentially increasing computational resources. Google’s recent demonstration starkly illustrates this divide: their Willow quantum processor completed calculations in five minutes that would take classical supercomputers 10 septillion years – while consuming 30,000 times less energy.

The parallels to early 2000s telecommunications are striking. Then, streaming video, cloud computing, and e-commerce demanded faster data speeds that existing infrastructure couldn’t provide. Today, AI applications like real-time molecular simulation, financial risk optimization, and large-scale pattern recognition are pushing against the physical limits of classical computing architectures. Just as the internet required fiber optic cables and broadband infrastructure, AI’s next phase demands quantum computational capabilities.

Breakthrough momentum accelerating toward mainstream adoption

The quantum computing landscape has undergone transformative changes in 2024-2025 that signal mainstream viability. Google’s Willow chip achieved below-threshold error correction – a critical milestone where quantum systems become more accurate as they scale up. IBM’s roadmap targets 200 logical qubits by 2029, while Microsoft’s topological qubit breakthrough promises inherent error resistance. These aren’t incremental improvements; they represent fundamental advances that make practical quantum-AI systems feasible.

Industry investments reflect this transition from research to commercial reality. Quantum startups raised $2 billion in 2024, representing a 138 per cent increase from the previous year. Major corporations are backing this confidence with substantial commitments: IBM’s $30 billion quantum R&D investment, Microsoft’s quantum-ready initiative for 2025, and Google’s $5 million quantum applications prize. The market consensus projects quantum computing revenue will exceed $1 billion in 2025 and reach $28-72 billion by 2035.
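As a quick sanity check on the funding figure above, the previous year's total can be backed out from the two numbers the paragraph gives. This is only arithmetic on the quoted values ($2 billion raised, a 138 per cent increase); the implied 2023 total is a derived estimate, not a figure reported in the article, and the variable names are illustrative.

```python
# Values quoted in the text above.
raised_2024_bn = 2.0        # USD billions raised by quantum startups in 2024
increase_fraction = 1.38    # a 138 per cent year-on-year increase

# A 138% increase means 2024 = 2023 * (1 + 1.38), so invert that relation.
implied_2023_bn = raised_2024_bn / (1 + increase_fraction)

print(f"Implied 2023 funding: about ${implied_2023_bn:.2f} billion")
```

In other words, if both quoted numbers are accurate, the 2023 base would have been somewhat under $1 billion.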

Expert consensus on the five-year transformation window

Leading quantum computing experts across multiple organizations align on a remarkably consistent timeline. IBM’s CEO predicts quantum advantage demonstrations by 2026, while Google targets useful quantum computers by 2029. Quantinuum’s roadmap promises universal fault-tolerant quantum computing by 2030. IonQ projects commercial quantum advantages in machine learning by 2027. This convergence suggests the 2025-2030 period will be as pivotal for quantum computing as 1995-2000 was for internet adoption.

The technical indicators support these projections. Current quantum systems achieve 99.9 per cent gate fidelity – crossing the threshold for practical applications. Multiple companies have demonstrated quantum advantages in specific domains: JPMorgan and Amazon reduced portfolio optimization problems by 80 per cent, while quantum-enhanced traffic optimization decreased Beijing congestion by 20 per cent. These proof-of-concept successes mirror the early internet’s transformative applications before widespread adoption.

Real-world quantum-AI applications emerging across industries

The most compelling evidence comes from actual deployments showing measurable improvements. Cleveland Clinic and IBM launched a dedicated healthcare quantum computer for protein interaction modeling in cancer research. Pfizer partnered with IBM for quantum molecular modeling in drug discovery. DHL optimized international shipping routes using quantum algorithms, reducing delivery times by 20 per cent.

These applications demonstrate quantum computing’s unique ability to solve problems that scale exponentially with classical approaches. Quantum systems process multiple possibilities simultaneously through superposition, enabling breakthrough capabilities in optimization, simulation, and machine learning that classical computers cannot replicate efficiently. The energy efficiency advantages are equally dramatic – quantum systems achieve 3-4 orders of magnitude better energy consumption for specific computational tasks.

The security imperative driving quantum adoption

Beyond performance advantages, quantum computing addresses critical security challenges that will force rapid adoption. Current encryption methods protecting AI systems will become vulnerable to quantum attacks within this decade. The US government has mandated federal agencies transition to quantum-safe cryptography, while NIST released new post-quantum encryption standards in 2024. Organizations face a “harvest now, decrypt later” threat where adversaries collect encrypted data today for future quantum decryption.

This security imperative creates unavoidable pressure for quantum adoption. Satellite-based quantum communication networks are already operational, with China’s quantum network spanning 12,000 kilometers and similar projects launching globally. The intersection of quantum security and AI protection will drive widespread infrastructure upgrades in the coming years.

Preparing for the quantum era transformation

The evidence overwhelmingly suggests we’re approaching a technological inflection point where quantum computing transitions from experimental curiosity to essential infrastructure. Just as businesses that failed to adapt to internet connectivity fell behind in the early 2000s, organizations that ignore quantum computing risk losing competitive advantage in the AI-driven economy.

The quantum revolution isn’t coming; it’s here. The next five to six years will determine which organizations successfully navigate this transition and which turn into casualties of technological change. AI systems must re-engineer themselves to leverage quantum capabilities, requiring new algorithms, architectures, and approaches that blend quantum and classical computing.

This represents more than incremental improvement; it’s a fundamental paradigm shift that will reshape how we approach computation, security, and artificial intelligence. The question isn’t whether quantum computing will transform AI – it’s whether we’ll be ready for the transformation.

(Krishna Kumar is a technology explorer & strategist based in Austin, Texas in the US. Rakshitha Reddy is an AI developer based in Atlanta, US)




OpenAI’s artificial intelligence (AI) model has achieved a monumental achievement in catching up wit..



A closed reasoning model under internal experimentation scored 35 out of 42 points at IMO 2025, thinking at length within the same time limits as humans… other models reached 10 to 30% accuracy

The closed reasoning large language model (LLM) that OpenAI is experimenting with internally has achieved the equivalent of a gold medal at the International Mathematical Olympiad (IMO). [Source = X account of OpenAI research scientist Alexander Wei]

OpenAI’s artificial intelligence (AI) model has reached a milestone in catching up with human intelligence at the International Mathematical Olympiad.

OpenAI has thus demonstrated its strength with dominant AI performance at a time when rumors of a crisis are emerging following its failed acquisition of the start-up Windsurf and the subsequent outflow of talent.

“We achieved gold medal-level performance at the International Mathematical Olympiad (IMO 2025) this year with a general-purpose reasoning model,” OpenAI CEO Sam Altman said on X on the 19th (local time).

“When I first started OpenAI, this was a dream. It is an important marker of how far AI has advanced over the past decade,” he said, explaining the significance of the achievement.

The results are based on a reasoning large language model (LLM) that is being developed experimentally in-house by a small team led by Alexander Wei, a research scientist at OpenAI.

The IMO is a prestigious Olympiad that has been held since 1959, in which students under the age of 20 take part as representatives of their countries. It is characterized by requiring mathematical thinking and creative ideas, not just problems that can be solved by memorizing formulas.

According to OpenAI, the test was conducted under the same conditions as for human contestants. The IMO consists of six problems in total, with contestants solving three problems in 4 hours and 30 minutes on each of two days.

OpenAI’s model scored 35 points out of 42 by solving 5 of the 6 problems, a result equivalent to a gold medal.
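The score can be reconstructed from the contest format with a short calculation. The article gives the 42-point maximum, the six problems and the two 4.5-hour sessions; the value of 7 points per problem is the standard IMO marking scheme and is an assumption added here, as is the simplification that each solved problem earned full marks.

```python
PROBLEMS = 6
POINTS_PER_PROBLEM = 7          # assumption: standard IMO marking, not stated in the article
PROBLEMS_PER_DAY = 3
MINUTES_PER_DAY = 4 * 60 + 30   # 4 hours 30 minutes per session

max_score = PROBLEMS * POINTS_PER_PROBLEM        # matches the 42 quoted above
solved = 5
score = solved * POINTS_PER_PROBLEM              # full marks on five problems
minutes_per_problem = MINUTES_PER_DAY / PROBLEMS_PER_DAY

print(f"{score}/{max_score} points, ~{minutes_per_problem:.0f} minutes per problem")
```

Five complete solutions at 7 points each account exactly for the reported 35 points, and the time limit works out to an average of 90 minutes per problem.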

In this year’s IMO, humans still outperformed the AI, with six contestants achieving perfect scores, but the result is regarded as a symbolic event showing how close the rapidly improving performance of LLMs has come to human intelligence.

Until now, LLMs have barely reached silver or bronze medal level at the IMO, let alone gold.

Google DeepMind’s “AlphaProof” and “AlphaGeometry 2” scored silver medals last year, but these models were specialized only in the math field.

Noam Brown, an OpenAI research scientist, said, “The achievements that AI has previously shown in Go and poker were the result of researchers spending years training AI to master only that specific domain. This model, however, is not IMO-specific; it is a reasoning LLM that incorporates new experimental general-purpose techniques.”

According to MathArena, a project at the Swiss Federal Institute of Technology (ETH Zurich) that tracks the mathematical performance of major models, none of the world’s leading models, such as Google’s Gemini 2.5 Pro, xAI’s Grok 4 and DeepSeek’s R1, even reached bronze-medal level on this year’s IMO 2025 problems.

The specifics behind OpenAI’s breakthrough have not been disclosed.

“We have developed new techniques that make LLMs much better at tasks that are difficult to verify,” Brown said. “o1 (the existing reasoning model) thinks for seconds, and the ‘deep research’ feature takes minutes, but this model thinks for hours.”

Meanwhile, the result has not been verified by a third party, as it was obtained with a closed experimental model that OpenAI has not officially released. MathArena said in response, “We are excited to see steep progress in this area and look forward to the launch of the model, enabling independent evaluation through public benchmarks.”




Copyright © 2025 AISTORIZ. For enquiries email at prompt@travelstoriz.com