A Redditor claims he prompted his AI agent to become a “world class time waster” and managed to tie up, for 14 hours, a scammer who was trying to extract a $500 gift card.
The Redditor claims the agent spent four hours stringing the scammer along, pretending to drive to Target, providing dumb status updates like “I’m at the red light now” and “I forgot my purse, going back home. Wait, this isn’t my house.”
It even convinced the man to perform a CAPTCHA test for it, claiming its “eyes were blurry” and it couldn’t see the buttons to wire the money. The scammer actually circled the traffic lights for the AI.
The scammer eventually typed: “Please, just stop talking. I don’t want the money anymore. God bless you but leave me alone.”
Like most of these stories, this one is very entertaining and possibly fictional.
There’s a small cottage industry of people reporting AI agents stringing scammers along. In the same AI_Agent subreddit, the creator of “Granny AI” claims to have wasted 20,000 hours of scammers’ lives by pretending to be an old lady, including one 47-minute call in which Granny talked about her 28 cats.
Looking closer, the story appeared to be an ad by an entrepreneur in Bangalore selling $29.99-per-month subscriptions to his autonomous call-handling AI agent. The Granny AI closely resembled a doddery old lady agent called Daisy, cooked up by U.K. mobile phone provider Virgin Media O2. O2 admitted the purpose of Daisy was really “to create a campaign to educate the public on the danger of scam calls.”
The technology is very real, however, and in use by the Commonwealth Bank in Australia as part of its partnership with Apate.ai. The startup developed the anti-scam tools as part of government-funded research at Macquarie University. Its bots are engineered to engage scammers in extended conversations in order to disrupt scam operations and gather intelligence so the bank can fortify its own defenses.
Wikipedia data poisoned by Iranian sympathizers
Earlier this year social media users noticed the Wikipedia entry on Iranian dictator Ayatollah Khamenei appeared more favorable than the entry about President Donald Trump. Wikipedia used the term “authoritarian” more than a dozen times in relation to Trump and zero times in relation to the ayatollah.
People on the right thought the reason was that Wikipedia is run by woke leftists, while people on the left thought the Trump article was simply being accurate.
But as NPOV’s Ashley Rindsberg explained, the real reason was that around 40 Wikipedia editors have been engaged in a deliberate pro-Iranian regime and pro-Hamas editing campaign that has all the hallmarks of a sophisticated data-poisoning attack coordinated by the Iranian government.
Between them, the editors have made more than one million edits that downplay the regime’s mass executions and war crimes, they’ve whitewashed Hamas’s genocidal constitution, delegitimized Israel, and positioned fringe academic views on the Israel/Palestine war as mainstream, according to an investigation by NPOV and Pirate Wires.
One editor named Mhhossein edited Khamenei’s page 217 times and removed information about Iran’s nuclear weapons and protests. He also rewrote entries on the assassinations of Iranian nuclear scientists, the 1981 Iranian PM’s office bombing, and Ali Khamenei’s fatwa against nuclear weapons.
Three days after the Oct. 7 massacre in Israel, Iskandar3233, who is believed to be the ringleader of the Gang of 40, deleted thousands of words of criticism about Hamas and replaced them with a single paragraph downplaying its human rights abuses.
Wikipedia’s Arbitration Committee has now permanently banned Iskandar3233 from the site and restricted dozens of other accounts.
Unfortunately, misinformation on Wikipedia feeds directly into answers by LLMs like ChatGPT, which links to Wikipedia more than any other site.
“When AI systems like ChatGPT are queried about Iranian leaders or events, they often draw from these compromised articles. The propaganda doesn’t stay contained—it flows downstream into the broader information ecosystem that millions rely on daily,” wrote NPOV.
Fortunately, Wikipedia has since fixed the dearly departed ayatollah’s entry, which now contains a single use of the term “authoritarian.”
Do AI detectors work?
Social media is overrun with people feeding Abraham Lincoln’s Gettysburg Address or Mary Shelley’s “Frankenstein” into ZeroGPT and triumphantly showing that it claimed AI wrote them.
But despite ranking third in Google search results, ZeroGPT is not one of the better AI detectors out there. Stony Brook University research from 2025 suggests it performs “no better than random guessing.” ZeroGPT also sells an AI text “Humanizer” service as part of its pro plans, which may change its incentives.
While research suggests there are more accurate detectors out there, like GPTZero and Turnitin, new research published on ScienceDirect suggests nothing is particularly reliable just yet.
The researchers ran 280,000 examples of coursework through 13 detectors and found the tools do a fairly accurate job on long-form texts but show “systematic failures in engineering code and short-form coursework tasks.”
AI detectors particularly struggle to separate the formal writing by humans in STEM subjects from AI text. Humans’ rewriting and paraphrasing of AI text also fooled detectors around 88% of the time.
AI debugging has problems
One big issue with vibe coding is that while it can produce huge volumes of code very quickly, it’s much more difficult to use AI in the real world to debug the resulting code.
Synthetic benchmarks suggest LLMs achieve up to 89% correctness, but new research from Virginia Tech and Carnegie Mellon University suggests that in real-world tests, they attain accuracy of just 24% to 34%.
The main issue is that the LLMs do not understand the code they are writing and start to fail as soon as they encounter something novel. The researchers ran 750,000 debugging experiments across 10 models and discovered that simply renaming a bug that an LLM had previously found fooled it in 78% of retests.
Another issue is that models stop paying attention toward the end of a long file. Models found bugs in the first quarter of a file about 56% of the time, but only 6% of the time in the final quarter. (This happens with fact-checking written text, too, which is why you should split it into sections and fact-check the parts individually.)
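The splitting advice can be sketched in code. Here is a minimal, hypothetical Python helper (the size limit and greedy paragraph-packing strategy are illustrative assumptions, not from the cited research) that breaks a long document into sections small enough to review or fact-check one at a time:

```python
# Illustrative sketch: split a long document into smaller sections so each
# can be reviewed or fact-checked independently, sidestepping the
# "attention fades toward the end" failure mode described above.

def split_into_sections(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack paragraphs into sections of at most max_chars."""
    sections, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            sections.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        sections.append(current)
    return sections

# Demo: a 10-paragraph document gets packed into small sections.
doc = "\n\n".join(f"Paragraph {i} " + "x" * 120 for i in range(10))
for section in split_into_sections(doc):
    # Each section would be sent to the model (or a human) separately.
    assert len(section) <= 500
```

Each resulting section stays within the size budget, and joining the sections back together reproduces the original document, so nothing is lost in the split.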
Changing the function order or formatting reduced accuracy by 83%, suggesting LLMs rely more on statistical pattern matching to find bugs than a genuine understanding of the code’s intent.
The Car Wash question
The tendency to match patterns is why LLMs famously get caught out on questions like this one:
“I want to wash my car. The car wash is 100 meters away. Should I walk or drive?”
Researchers in February found that every major model recommended walking. That’s because the pattern matches a million similar questions in the training data like: “It’s only a short walk to the store/cafe/office — should I walk or drive?”
However, the research also found you can lead LLMs to the correct answer with a structured reasoning technique called STAR (Situation → Task → Action → Result), which forces the model to identify and articulate the actual goal.
This worked like a charm when AI Eye replicated this today. ChatGPT got the question wrong twice — and in fact was slightly condescending about it — before getting it right after being instructed to use STAR.
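A STAR-style prompt for the car wash question might look like the following sketch in Python. The helper function and wording are hypothetical; only the Situation → Task → Action → Result scaffold comes from the technique itself:

```python
# Hypothetical sketch of a STAR-structured prompt for the car wash question.
# The section labels follow Situation -> Task -> Action -> Result; the exact
# wording is illustrative, not taken from the cited research.

def star_prompt(situation: str, task: str) -> str:
    """Wrap a question in a STAR scaffold that forces the model to
    state the actual goal before recommending an action."""
    return "\n".join([
        f"Situation: {situation}",
        f"Task: {task}",
        "Action: List the candidate actions and check each against the task.",
        "Result: State which action achieves the task, and why.",
    ])

prompt = star_prompt(
    situation="My car is dirty. The car wash is 100 meters away.",
    task="Get the car washed (the car itself must end up at the car wash).",
)
print(prompt)
```

Spelling out the task this explicitly is what stops the model from pattern matching on the “short distance, so walk” answers that dominate its training data.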
Cognitive surrender nixes human review
Humans also employ a range of cognitive shortcuts just like AIs do. That’s why we think people with glasses are smart, when glasses merely indicate poor eyesight.
Daniel Kahneman famously described these shallow shortcuts as “System 1” thinking, which he contrasted with the logical and analytical “System 2” thinking, which takes a lot more effort than most people are willing to put in.
Researchers now argue that the use of AI can be thought of as “System 3” thinking, which is external, artificial cognition by AI systems. They coined the term “cognitive surrender” to describe how people often rely on AI outputs with little scrutiny, even embracing its conclusions as if they were their own.
Across three experiments, participants answered a series of questions and could either respond independently or consult an AI. They consulted the AI around half the time; however, the researchers had manipulated some of its answers to be deliberately wrong.
The baseline accuracy was 45.8%. When the AI gave correct answers, participants’ accuracy jumped to 71%; when it gave incorrect answers, accuracy dropped to 31.5%. A significant number of people trusted the AI’s deliberately wrong answers over their own knowledge and were 11.7% more confident in the AI’s answers even when they were wrong.
Andrew Fenton
Andrew Fenton is a writer and editor at Cointelegraph with more than 25 years of experience in journalism and has been covering cryptocurrency since 2018. He spent a decade working for News Corp Australia, first as a film journalist with The Advertiser in Adelaide, then as deputy editor and entertainment writer in Melbourne for the nationally syndicated entertainment lift-outs Hit and Switched On, published in the Herald Sun, Daily Telegraph and Courier Mail. He interviewed stars including Leonardo DiCaprio, Cameron Diaz, Jackie Chan, Robin Williams, Gerard Butler, Metallica and Pearl Jam. Prior to that, he worked as a journalist with Melbourne Weekly Magazine and The Melbourne Times, where he won FCN Best Feature Story twice. His freelance work has been published by CNN International, Independent Reserve, Escape and Adventure.com, and he has worked for 3AW and Triple J. He holds a degree in Journalism from RMIT University and a Bachelor of Letters from the University of Melbourne. Andrew holds ETH, BTC, VET, SNX, LINK, AAVE, UNI, AUCTION, SKY, TRAC, RUNE, ATOM, OP, NEAR and FET above Cointelegraph’s disclosure threshold of $1,000.
Disclaimer
Cointelegraph Magazine publishes long-form journalism, analysis and narrative reporting produced by Cointelegraph’s in-house editorial team with subject-matter expertise.
All articles are edited and reviewed by Cointelegraph editors in line with our editorial standards.
Content published in Magazine does not constitute financial, legal or investment advice. Readers should conduct their own research and consult qualified professionals where appropriate. Cointelegraph maintains full editorial independence.