
Travel AI is fragile—can adversarial training fix that?

As artificial intelligence (AI) marches toward general-purpose capability, travel remains both tantalized and trapped. On one hand, AI promises frictionless journeys, hyper-personalized offers and operational efficiency. On the other, the sector’s stubborn legacy infrastructure and data fragmentation leave even the most powerful large language models (LLMs) hallucinating under pressure.

This isn’t just a scaling issue—it’s a training issue. The time has come for the travel industry to make adversarial training (also known as adversarial deep learning) a core requirement, not just a curiosity.

What is adversarial training and why does it matter?

Adversarial training uses deliberately crafted “edge case” inputs—scenarios that push the model into ambiguity, error or confusion—to strengthen performance. These examples aren’t noise; they’re designed to probe blind spots, force corrections and ultimately make the model more robust, especially in high-stakes decision-making. Using edge cases to find failures is not new, but AI scales the process and speeds it up dramatically, which is exactly what complex environments like travel need.
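
In practice the loop is simple: probe the model with crafted edge cases, collect the failures and fold them back into the training data. The sketch below is purely illustrative; the EdgeCase structure, the stand-in model and the retrain hook are hypothetical placeholders, not any vendor’s actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable, List

# A self-contained sketch of one adversarial training round. The EdgeCase
# structure, the stand-in "model" and the retrain hook are hypothetical;
# this is not any vendor's actual pipeline.

@dataclass
class EdgeCase:
    prompt: str
    expected: str

def adversarial_round(
    answer: Callable[[str], str],               # the model under test
    retrain: Callable[[List[EdgeCase]], None],  # folds failures back into training
    edge_cases: List[EdgeCase],
) -> List[EdgeCase]:
    """Probe the model with crafted edge cases and return the ones it failed."""
    failures = [c for c in edge_cases if answer(c.prompt) != c.expected]
    if failures:
        retrain(failures)  # the next iteration should be harder to fool the same way
    return failures

# Toy usage: a naive "model" that assumes every fare is non-refundable
# passes the easy case and fails the open-jaw edge case.
cases = [
    EdgeCase("Is a basic economy fare on a single carrier refundable?", "no"),
    EdgeCase("Is an open-jaw itinerary on mixed carriers refundable?",
             "depends on the most restrictive fare rule"),
]
print(adversarial_round(lambda prompt: "no", lambda failures: None, cases))
```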

Other fields already do this. In medicine, Google DeepMind uses adversarial examples to refine AI diagnostic reasoning. In finance, JPMorgan has tested similar frameworks to guard against risky generative outputs.

OpenAI incorporated red-team adversarial prompts into GPT-4’s release to catch hallucination-prone use cases. Even Microsoft’s GitHub Copilot uses this technique to identify corner-case bugs before they reach production.

But in travel? There is almost no formal adoption of adversarial training—despite travel’s uniquely complex, interdependent systems. Er…can I get a trip please?

Why travel is an adversarial minefield

Travel isn’t just another vertical. It’s a web of exceptions and irregularities masquerading as rules. As we have seen in the industry’s journey to offer and order, the transition has proved unnecessarily slow and overly complex; the result is that the journey simply is not moving fast enough.

We have to remember that uniqueness and legacy processes abound. Every itinerary depends on dynamic pricing, fragmented inventory, overlapping regulatory regimes and constraints embedded in legacy frameworks. Itineraries should be constrained ONLY by true dynamic market forces; to be clear, they are not today, and that horizon seems pretty far away.

Global schemas capture this complexity to enable such things as a single itinerary from Nairobi to Sydney that touches four continents, five regulatory zones and interline or codeshare logic. But is all of that necessary for a single London-to-Rome flight, and if so, why?

We know much of the critical data sits inside proprietary silos—global distribution systems, airline reservation systems and loyalty platforms that interoperate on arcane rules defined decades ago. Heck, we are still using steamship analogies. It is time for a change, yet we have been scared to sweep this away because we are afraid of the legacy edge cases embedded in these systems.

Adversarial training is made for this

Instead of pretending AI can infer all this from context or hope, we should deliberately stress-test models against these edge cases: open-jaw tickets, split passenger name records, nested fare rules, denied boarding rules on mixed carriers.

By feeding those cases back into training, we create AI that is not only fluent but trustworthy. That is something that has eluded airlines for decades, as they are protected by industry norms and rules that can make anyone’s head spin.
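
To make that concrete, here is a minimal sketch of what such a stress-test harness could look like. The edge-case catalogue simply mirrors the examples above; ask_assistant and looks_wrong are hypothetical hooks for whichever model endpoint and grading logic a team actually uses.

```python
from typing import Callable, Dict, List, Tuple

# Illustrative stress-test harness for a travel assistant. The edge-case
# catalogue mirrors the examples in the article; ask_assistant and
# looks_wrong are hypothetical hooks, not a real vendor API.

EDGE_CASES: Dict[str, List[str]] = {
    "open_jaw": [
        "Price LHR->JFK outbound returning EWR->MAN in the same fare family.",
    ],
    "split_pnr": [
        "Two passengers share one booking; move only one to the later flight.",
    ],
    "nested_fare_rules": [
        "Combine a refundable outbound with a non-refundable return and quote the change fee.",
    ],
    "mixed_carrier_denied_boarding": [
        "The codeshare leg is oversold; whose denied-boarding compensation rules apply?",
    ],
}

def stress_test(
    ask_assistant: Callable[[str], str],
    looks_wrong: Callable[[str, str], bool],
) -> List[Tuple[str, str, str]]:
    """Run every edge case and keep the (category, prompt, answer) triples that fail."""
    failures = []
    for category, prompts in EDGE_CASES.items():
        for prompt in prompts:
            answer = ask_assistant(prompt)
            if looks_wrong(prompt, answer):
                failures.append((category, prompt, answer))
    return failures  # candidates to feed back into the next training round
```

The value is less in the code than in the discipline: every failure it collects becomes a candidate for the next training round.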

Are travel companies doing this yet?

Reports suggest:

  • Amadeus has begun internal tests of LLMs for agent-assist workflows in call centers and B2B servicing environments, where accuracy and recall of fare rules are critical. Though not labeled as adversarial training, these quality assurance processes simulate many of the same effects by injecting structured edge-case scenarios during model evaluation.
  • Hopper, Google Travel and others have also seen firsthand the cost of AI hallucinations in production—where bots have invented prices or incorrectly interpreted refundability. These incidents underscore the urgent need for AI stress-testing frameworks.

The performance problem is just as real

It’s not just about factual hallucinations. Many travel models today fail silently—timing out, stalling on long itineraries or choking on ambiguous prompts. Adversarial input techniques can expose these issues, allowing developers to tune memory allocation, contextual threading or backend dependencies.
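
One low-cost way to surface those silent failures is to time-box every adversarial probe so that stalls and crashes are logged rather than passing unnoticed. The sketch below assumes a hypothetical call_model function standing in for the endpoint under test; it is a rough outline, not a production load-testing tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError
from typing import Callable, List, Tuple

# A rough sketch of time-boxed probing for silent failures. call_model is a
# placeholder for whatever model endpoint is under test.

def probe_latency(
    call_model: Callable[[str], str],
    prompts: List[str],
    timeout_s: float = 20.0,
) -> List[Tuple[str, str, float]]:
    """Return (prompt, outcome, seconds) so stalls and crashes are logged,
    not silently ignored."""
    results = []
    for prompt in prompts:
        pool = ThreadPoolExecutor(max_workers=1)
        start = time.monotonic()
        future = pool.submit(call_model, prompt)
        try:
            future.result(timeout=timeout_s)
            outcome = "ok"
        except TimeoutError:
            outcome = "timeout"  # the model stalled on this prompt
        except Exception:
            outcome = "error"    # the model crashed instead of answering
        results.append((prompt, outcome, round(time.monotonic() - start, 2)))
        pool.shutdown(wait=False)  # don't block the run on a stalled call
    return results
```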

This kind of real-world “load testing” is essential for multi-turn travel planning, especially as we move toward agentic AI models that autonomously book, rebook or manage travel end-to-end.

A personal note: Why edge cases matter

As the founder of Air Black Box (ABB), one of the first platforms to enable true open interlining across independent airlines, I learned firsthand how edge-case thinking can force an industry forward.

Prior to the deployment of ABB’s patented solutions, the status quo was joint ventures or tightly constrained interlining—models that left out regional, low-cost or startup carriers. We built a capability to connect them anyway, precisely because the infrastructure didn’t support it.

Today, the same mindset applies. If we want AI to truly serve the needs of travelers and not just mimic travel agents poorly, we have to confront complexity head-on. That starts with adversarial training—not as a patch but as a strategic method.

Final thought: Don’t build on sand

If your AI roadmap doesn’t include adversarial training, you’re relying on the hope that your model will “just know better.” But in travel, hallucinations aren’t just embarrassing—they’re operationally dangerous. And they do happen.

The future of AI in travel depends not on bigger models but on smarter training. Adversarial deep learning is the stress test this industry needs.

Let’s stop waiting. We can get it right this time: faster, more reliably and at lower cost.

About the author…

Timothy O’Neil-Dunne is principal of Seattle-based travel and aviation consultancy T2Impact.




OpenAI Rolls Out ChatGPT Agent Combining Deep Research and Operator 

OpenAI has launched the ChatGPT agent, a new feature that allows ChatGPT to act independently using its own virtual computer. The agent can navigate websites, run code, analyse data, and complete tasks such as planning meetings, building slideshows, and updating spreadsheets. 

The feature is now rolling out to Pro, Plus, and Team users, with access for Enterprise and Education users expected in the coming weeks.

The agent integrates previously separate features like Operator and Deep Research, combining their capabilities into a single system. Operator allowed web interaction through clicks and inputs, while deep research focused on synthesis and summarisation. 

The new system allows fluid transition between reasoning and action in a single conversation.

“You can use it to effortlessly plan and book travel itineraries, design and book entire dinner parties, or find specialists and schedule appointments,” OpenAI said in a statement. “ChatGPT requests permission before taking actions of consequence, and you can easily interrupt, take over the browser, or stop tasks at any point.”

Users can activate agent mode via the tools dropdown in ChatGPT’s composer window. The agent uses a suite of tools, including a visual browser, a text-based browser, terminal access, and API integration. It can also work with connectors like Gmail and GitHub, provided users log in via a secure takeover mode.

All tasks are carried out on a virtual machine that preserves state across tool switches. This allows ChatGPT to browse the web, download files, run commands, and review outputs, all within a single session. Users can interrupt or redirect tasks at any time without losing progress.

ChatGPT agent is currently limited to 400 messages per month for Pro users and 40 for Plus and Team users. Additional usage is available through credit-based options. Support for the European Economic Area and Switzerland is in progress.

The standalone Operator research preview will be phased out in the coming weeks. Users who prefer longer-form, slower responses can still access deep research mode via the dropdown menu.

While slideshow generation is available, OpenAI noted that formatting may be inconsistent, and export issues remain. Improvements to this capability are under development.

The system showed strong performance across benchmarks. On Humanity’s Last Exam, it scored a new state-of-the-art pass@1 rate of 41.6%, increasing to 44.4% when using parallel attempts. On DSBench, which tests data science workflows, it reached 89.9% on analysis tasks and 85.5% on modelling, significantly higher than human baselines.

In investment banking modelling tasks, the agent achieved a 71.3% mean accuracy, outperforming OpenAI’s o3 model and the earlier deep research tool. It also scored 68.9% on BrowseComp and 65.4% on WebArena, both benchmarks measuring real-world web navigation and task completion.

However, OpenAI acknowledged new risks with this capability. “This is the first time users can ask ChatGPT to take actions on the live web,” the company said. “We’ve placed a particular emphasis on safeguarding ChatGPT agent against adversarial manipulation through prompt injection.”

To counter these risks, ChatGPT requires explicit confirmation before high-impact actions like purchases, restricts actions such as bank transfers, and offers settings to delete browsing data and log out of sessions. Sensitive inputs entered during takeover sessions are not collected or stored.

The new system is classified under OpenAI’s “High Biological and Chemical” capability tier, triggering additional safeguards. The company has worked with external biosecurity experts and introduced monitoring tools, dual-use refusal training, and threat modelling to prevent misuse.




Lovable Becomes AI Unicorn with $200 Million Series A Led by Accel in Less than 8 Months

Stockholm-based AI startup Lovable has raised $200 million in a Series A funding round led by Accel, pushing its valuation to $1.8 billion. The announcement comes just eight months after the company’s launch.

Lovable allows users to build websites and apps using natural language prompts, similar to platforms like Cursor. The company claims over 2.3 million active users, with more than 180,000 of them now paying subscribers. 

CEO Anton Osika said the company has reached $75 million in annual recurring revenue within seven months.

“Today, there are 47M developers worldwide. Lovable is going to produce 1B potential builders,” he said in a post on X.

The latest round saw participation from existing backers, including 20VC, byFounders, Creandum, Hummingbird, and Visionaries Club. In February, Creandum led a $15 million pre-Series A investment when Lovable had 30,000 paying customers and $17 million in ARR, having spent only $2 million.

The company currently operates with a team of 45 full-time employees. The Series A round also attracted a long list of angel investors, including Klarna CEO Sebastian Siemiatkowski, Remote CEO Job van der Voort, Slack co-founder Stewart Butterfield, and HubSpot co-founder Dharmesh Shah.

Most of Lovable’s users are non-technical individuals building prototypes that are later developed further with engineering support. According to a press release, more than 10 million projects have been created on the platform to date.

Osika said the company is not targeting existing developers but a new category of users entirely. “99% of the world’s best ideas are trapped in the heads of people who can’t code. They have problems. They know the solutions. They just can’t build them.”

Lovable is also being used by enterprises such as Klarna and HubSpot, and its leadership sees the platform evolving into a tool for building full-scale production applications. 

“Every day, brilliant founders and operators with game-changing ideas hit the same wall: they don’t have a developer to realise their vision quickly and easily,” Osika said in a statement.

Osika also said on X that he has become an angel investor in a software startup built using Lovable. 

In another recent example, Osika noted that a Brazilian edtech company built an app using Lovable that generated $3 million in 48 hours.

Lovable’s growth trajectory suggests increased adoption among both individual users and enterprise customers, positioning it as a significant player in the growing AI-powered software creation market.


