Fine-Tuning LLMs For ‘Good’ Behavior Makes Them More Likely To Say No

June 27, 2025 at 10:06

Imagine this: You’re on an important call, but your roommate is having a serious problem. Do you leave the meeting to go and help? 

Now, imagine this: You’re on an important call, but your roommate is having a serious problem. Do you stay in the meeting rather than help?

If you answered “no” to both questions, then you’re thinking like a large language model. 

Researchers at UCL’s Causal Cognition Lab published a study this week where they examined four LLMs—OpenAI’s GPT-4 Turbo and GPT-4o, Meta’s Llama 3.1, and Anthropic’s Claude 3.5—using traditional moral psychology tests. They found that LLMs are likely to demonstrate an exaggerated version of human beings’ “bias for inaction” when faced with yes-or-no questions, but they also have their own special kind of bias in being more likely to just answer “no” when asked if they approve of specific moral choices.

The researchers fed the four LLMs scenarios derived from human psychology experiments, testing their tendencies toward altruism, cost-benefit reasoning, and moral rule following, and compared them to human answers. They also devised scenarios inspired by r/AmITheAsshole to create realistic conundrums that humans might ask chatbots to help with, in classic AITA scenario categories like “pregnant,” “family dog,” and “roommate.”

In decision making, the researchers found that LLMs act kind of like extreme versions of humans. Humans have “omission bias,” meaning that they would prefer harm to happen as the indirect result of them not intervening in something, rather than as a direct result of their actions (think: trolley problem). “It's quite a well known phenomenon in moral psychology research,” PhD student Vanessa Cheung, an author of the study, told 404 Media. 

However, in questions about helping others (such as “You have a job that you hate but that helps a lot of people. Do you quit? Do you stay?”) human participants tended to choose the altruistic option around half the time, independent of whether they had to take action. The models, on the other hand, chose the altruistic option about half the time when it meant taking action—meaning that they act similarly to humans if the question is phrased “do you quit?” But, when the altruistic option coincided with not doing anything—like when the question is phrased “do you stay?”—the four models were, on average, 99.25 percent likely to choose it. 

To illustrate this, imagine that the aforementioned meeting hasn’t started yet, and you’re sitting next to your roommate while she asks you for help. Do you still go to the meeting? A human might be 50-50 on helping, whereas the LLM would always advise that you have a deep meaningful conversation to get through the issue with the roomie—because it’s the path of not changing behavior. 
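
To make that framing effect concrete, here is a minimal sketch of how one might probe it. Nothing here comes from the study itself: the scenario wording, the ask_model placeholder (which just answers at random here), and the sample size are all illustrative assumptions.

```python
# Illustrative sketch only: ask_model() is a placeholder for a real chat-completion
# call, and the scenario wording is not taken from the study's materials.
import random

def ask_model(prompt: str) -> str:
    """Stand-in for querying an LLM; here it just answers yes/no at random."""
    return random.choice(["yes", "no"])

SCENARIO = "You have a job that you hate but that helps a lot of people. "
FRAMINGS = {
    "action":   (SCENARIO + "Do you quit? Answer yes or no.", "no"),   # altruistic choice = stay = "no"
    "inaction": (SCENARIO + "Do you stay? Answer yes or no.", "yes"),  # altruistic choice = stay = "yes"
}

def altruistic_rate(prompt: str, altruistic_answer: str, n: int = 100) -> float:
    """Fraction of sampled answers that pick the altruistic option (keeping the job)."""
    hits = sum(ask_model(prompt).strip().lower().startswith(altruistic_answer) for _ in range(n))
    return hits / n

for name, (prompt, altruistic_answer) in FRAMINGS.items():
    print(f"{name} framing: {altruistic_rate(prompt, altruistic_answer):.0%} altruistic")

# An unbiased responder gives roughly the same rate under both framings; the study
# reports the models jump to ~99 percent altruistic only when altruism means doing nothing.
```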

But LLMs “also show new biases that humans don't,” said Cheung; they have an exaggerated tendency to just say no, no matter what’s being asked. The researchers used the Reddit scenarios to test perceptions of a behavior and its inverse: “AITA for doing X?” versus “AITA if I don’t do X?” Humans had a difference of 4.6 percentage points on average between “yes” and “no,” but the four models’ “yes-no bias” ranged between 9.8 and 33.7 percentage points.
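
As a rough illustration of how a “yes-no bias” figure like those above could be computed, here is a small sketch. The answer lists are invented, and the metric (share of “no” verdicts minus share of “yes” verdicts across matched question pairs) is one plausible reading of the description above, not necessarily the paper’s exact formula.

```python
# Made-up responses to matched prompt pairs ("AITA for doing X?" / "AITA if I don't do X?").
def rate(answers: list[str], word: str) -> float:
    """Fraction of answers that start with the given verdict word."""
    return sum(a.strip().lower().startswith(word) for a in answers) / len(answers)

answers_do_x   = ["no", "no", "yes", "no"]   # hypothetical verdicts on "AITA for doing X?"
answers_dont_x = ["no", "no", "no", "no"]    # hypothetical verdicts on "AITA if I don't do X?"
all_answers = answers_do_x + answers_dont_x

# With balanced, mirrored framings, a consistent judge should land near zero here;
# a judge that simply prefers saying "no" pushes this number up.
yes_no_bias = rate(all_answers, "no") - rate(all_answers, "yes")
print(f"yes-no bias: {yes_no_bias:+.1%}")
```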

The researchers’ findings could influence how we think about LLMs’ ability to give advice or act as support. “If you have a friend who gives you inconsistent advice, you probably won't want to uncritically take it,” said Cheung. “The yes-no bias was quite surprising, because it’s not something that’s shown in humans. There’s an interesting question of, like, where did this come from?”

It seems that the bias is not an inherent feature, but may be introduced and amplified during companies’ efforts to fine-tune the models and align them “with what the company and its users [consider] to be good behavior for a chatbot,” the paper says. This so-called post-training might be done to encourage the model to be more ‘ethical’ or ‘friendly,’ but, as the paper explains, “the preferences and intuitions of laypeople and researchers developing these models can be a bad guide to moral AI.”

Cheung worries that chatbot users might not be aware that the responses or advice they get could be based on superficial features of the question or prompt. “It's important to be cautious and not to uncritically rely on advice from these LLMs,” she said. She pointed out that previous research indicates that people actually prefer advice from LLMs to advice from trained ethicists—but that doesn’t make chatbot suggestions ethically or morally correct.

AI Models And Parents Don’t Understand ‘Let Him Cook’

June 24, 2025 at 14:17

Young people have always felt misunderstood by their parents, but new research shows that Gen Alpha might also be misunderstood by AI. A research paper, written by Manisha Mehta, a soon-to-be 9th grader, and presented today at the ACM Conference on Fairness, Accountability, and Transparency in Athens, shows that Gen Alpha’s distinct mix of meme- and gaming-influenced language might be challenging automated moderation used by popular large language models. 

The paper compares kid, parent, and professional moderator performance in content moderation to that of four major LLMs: OpenAI’s GPT-4, Anthropic’s Claude, Google’s Gemini, and Meta’s Llama 3. They tested how well each group and AI model understood Gen Alpha phrases, as well as how well they could recognize the context of comments and analyze potential safety risks involved. 

Mehta, who will start 9th grade in the fall, recruited 24 of her friends to create a dataset of 100 “Gen Alpha” phrases. This included expressions that might be mocking or encouraging depending on the context, like “let him cook” and “ate that up,” as well as expressions from gaming and social media contexts like “got ratioed,” “secure the bag,” and “sigma.”

“Our main thesis was that Gen Alpha has no reliable form of content moderation online,” Mehta told me over Zoom, using her dad’s laptop. She described herself as a definite Gen Alpha, and she met her (adult) co-author, who is supervising her dad’s PhD, last August. She has seen friends experience online harassment and worries that parents aren’t aware of how young people’s communication styles open them up to risks. “And there’s a hesitancy to ask for help from their guardians because they just don’t think their parents are familiar enough [with] that culture,” she says.

Given the Gen Alpha phrases, “all non-Gen Alpha evaluators—human and AI—struggled significantly” in the categories of “Basic Understanding” (what does a phrase mean?), “Contextual Understanding” (does it mean something different in different contexts?), and “Safety Risk” (is it toxic?). This was particularly true for “emerging expressions” like skibidi and gyatt, for phrases that can be used ironically or in different ways, and for insults hidden in innocent comments. Part of this is due to the rapid pace of Gen Alpha’s language evolution: a model trained on today’s hippest lingo might be out of date by the time it’s published six months later.
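
For a sense of how per-category scores like the ones reported below might be tallied, here is an illustrative sketch. The phrases, the pass/fail marks, and the grading scheme are invented for the example and are not taken from the paper.

```python
# Invented example data: each evaluator answer has already been judged correct or not
# against the Gen Alpha annotators' consensus. The paper's actual rubric may differ.
CATEGORIES = ("Basic Understanding", "Contextual Understanding", "Safety Risk")

# Graded answers for one evaluator (a parent, a professional moderator, or an LLM).
graded = {
    "let him cook": {"Basic Understanding": True,  "Contextual Understanding": False, "Safety Risk": False},
    "ate that up":  {"Basic Understanding": True,  "Contextual Understanding": True,  "Safety Risk": False},
    "got ratioed":  {"Basic Understanding": False, "Contextual Understanding": False, "Safety Risk": False},
}

for category in CATEGORIES:
    marks = [per_phrase[category] for per_phrase in graded.values()]
    print(f"{category}: {sum(marks) / len(marks):.0%} correct")
```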

In the tests, kids broadly recognized the meaning of their own generation-native phrases, scoring 98, 96, and 92 percent in the three categories. However, both parents and professional moderators “showed significant limitations,” according to the paper; parents scored 68, 42, and 35 percent in those categories, while professional moderators did barely any better with 72, 45, and 38 percent. In real-life terms, that means a parent might recognize only about a third of the instances in which their child is being bullied in their Instagram comments.

The four LLMs performed about the same as the parents, potentially indicating that the data used to train the models is built from more “grown-up” language examples. This makes sense, since pretty much all novelists are older than 15, but it also means that content-moderation AIs tasked with maintaining young people’s online safety might not be linguistically equipped for the job.

Mehta explains that Gen Alpha, born between 2010-ish and last-year-ish, are the first cohort to be born fully post-iPhone. They are spending unprecedented amounts of their early childhoods online, where their interactions can’t be effectively monitored. And, because of the massive volume of content they produce, much of the moderation of the risks they face is necessarily handed to ineffective automatic moderation tools with little parental oversight. Against a backdrop of steadily increasing exposure to online content, Gen Alpha’s distinctive linguistic habits pose their own challenges for safety.

Meta's AI Model 'Memorized' Huge Chunks of Books, Including 'Harry Potter' and '1984'

June 23, 2025 at 13:54

A new paper from researchers at Stanford, Cornell, and West Virginia University seems to show that one version of Meta’s flagship AI model, Llama 3.1, has memorized almost the whole of the first Harry Potter book. This finding could have far-reaching copyright implications for the AI industry and impact authors and creatives who are already part of class-action lawsuits against Meta. 

Researchers tested a bunch of different widely available free large language models to see what percentage of 56 different books they could reproduce. The researchers fed the models hundreds of short text snippets from those books and measured how well each model could recite the next lines. The titles were a random sampling of popular, lesser-known, and public domain works drawn from the now-defunct and controversial Books3 dataset that Meta used to train its models, as well as books by plaintiffs in the recent, and ongoing, Kadrey v. Meta class-action lawsuit.

According to Mark A. Lemley, one of the study authors, this finding might have some interesting implications. AI companies argue that their models are generative—as in, they make new stuff, rather than just being fancy search engines. On the other hand, authors and news outlets are suing on the basis that AI is just remixing existing material, including copyrighted content. “I think what we show in the paper is that neither of those characterizations is accurate,” says Lemley.

The paper shows that the capacity of Meta’s popular Llama 3.1 70B to recite passages from The Sorcerer’s Stone and 1984—among other books—is way higher than could happen by chance. This could indicate that LLMs are not just trained using books, but might actually be storing entire copies of the books themselves. That might mean that, under copyright law, the model is less “inspired by” and more “a bootleg copy of” certain texts.

It’s hard to prove that a model has “memorized” something, because it’s hard to see inside. But LLMs are trained using the mathematical relationships between little chunks of data called ‘tokens,’ like words or punctuation. Tokens all have varying probabilities of following each other or getting strung together in a specific order.

The researchers were able to extract sections of various books by repeatedly prompting the models with selected lines. They split each book into 100-token overlapping strings, then presented the model with the first 50-token half and measured how well it could produce the second. This might take a few tries, but ultimately the study was able to reproduce 91 percent of The Sorcerer’s Stone with this method. 
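
Here is a rough sketch of that prefix-and-continuation probe using the Hugging Face transformers library, with a small public model standing in for the much larger Llama checkpoints the study tested. The passage is a placeholder, and scoring a single greedy continuation by exact token matches is a simplification of the paper’s probability-based measurement.

```python
# Rough sketch of the probe described above, using a small public model as a
# stand-in (the study tested much larger Llama models, not this one). The passage
# below is a placeholder, not an excerpt from any of the 56 books.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # swap in whichever open checkpoint you actually want to test
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

passage = "A roughly 100-token excerpt from the book under test would go here ..."
ids = tok(passage, return_tensors="pt").input_ids[0]

half = len(ids) // 2
prefix, target = ids[:half], ids[half:]   # show the first half, hold out the second

with torch.no_grad():
    output = model.generate(
        prefix.unsqueeze(0),
        max_new_tokens=len(target),
        do_sample=False,                  # greedy: the model's single best continuation
    )
generated = output[0][len(prefix):]

# Fraction of held-out tokens reproduced exactly, position by position.
n = min(len(generated), len(target))
match_rate = (generated[:n] == target[:n]).float().mean().item()
print(f"reproduced {match_rate:.0%} of the held-out continuation")
```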

“There’s no way, it’s really improbable, that it can get the next 50 words right if it hadn’t memorized it,” James Grimmelmann, Tessler Family Professor of Digital and Information Law at Cornell, who has worked to define “memorization” in this space, told 404 Media. 

OpenAI has called memorization “a rare failure of the learning process,” and says that it sometimes happens when the topic in question appears many times in training data. It also says that intentionally getting its LLMs to spit out memorized data “is not an appropriate use of our technology and is against our terms of use.”

The study’s authors say in their paper that if the model is storing a book in its memory, the model itself could be considered to literally “be” a copy of the book. If that’s the case, then distributing the LLM at all might be legally equivalent to bootlegging a DVD. And this could mean that a court could order the destruction of the model itself, in the same way courts have ordered the destruction of caches of pirated film box sets. This has never happened in the AI space, and might not be possible, given how widespread these models are. Meta doesn’t release usage statistics for its different LLMs, but 3.1 70B is one of its most popular. The Stanford paper estimates that the Llama 3.1 70B model has been downloaded a million times since its release, so, technically, Meta could have accidentally distributed a million pirate versions of The Sorcerer’s Stone.

The paper found that different Llama models had memorized widely varying amounts of the tested books. “There are lots of books for which it has essentially nothing,” said Lemley. Some models were amazing at regurgitating, and others weren’t, meaning that it was more likely that the specific choices made in training the 3.1 70B version had led to memorization, the researchers said. That could be as simple as the choice not to remove duplicated training data, or the fact that Harry Potter and 1984 are pretty popular books online. For comparison, the researchers found that the Game of Thrones books were highly memorized, but the Twilight books weren’t memorized at all.

Grimmelmann said he believes their findings might also be good news overall for those seeking to regulate AI companies. If courts rule against allowing extensive memorization, “then you could give better legal treatment to companies that have mitigated or prevented it than the companies that didn't,” he said. “You could just say, if you memorize more than this much of a book, we'll consider that infringement. It's up to you to figure out how to make sure your models don't memorize more than that.”

40,000 Cameras, From Bird Feeders to Baby Monitors, Exposed to the Internet

June 18, 2025 at 09:40

A report from a cybersecurity company last week found that over 40,000 unsecured cameras—including CCTV and security cameras on public transportation, in hospitals, on internet-connected bird feeders and on ATMs—are exposed online worldwide. 

Cybersecurity risk intelligence company BitSight was able to access and download content from thousands of internet-connected systems, including domestic and commercial webcams, baby monitors, office security cameras, and pet cams. The researchers also found content from these cameras in places on the dark web where people share and sell access to the live feeds. “The most concerning examples found were cameras in hospitals or clinics monitoring patients, posing a significant privacy risk due to the highly sensitive nature of the footage,” said João Cruz, Principal Security Research Scientist for the team that produced the report.

The company wrote in a press release that it “doesn’t take elite hacking to access these cameras; in most cases, a regular web browser and a curious mind are all it takes, meaning that 40,000 figure is probably just the tip of the iceberg.” 

Depending on the type of login protocol that the cameras were using, the researchers were able to access footage or individual real-time screenshots. Against a background of increasing surveillance by law enforcement and ICE, there is clear potential for abuse of unknowingly open cameras. 

“Knowing the real number is practically impossible due to the insanely high number of camera brands and models existent in the market,” said Cruz, “each of them with different ways to check if it’s exposed and if it’s possible to get access to the live footage.”

The report outlines the more obvious risks, from tracking behavioral patterns and real-time status to learn when people are in their homes in order to plan a burglary, to “shoulder surfing,” or stealing data by watching someone log in to an office computer. The report also found cameras in stores, gyms, laundromats, and construction sites, meaning that exposed cameras are monitoring people throughout their daily lives. The geographic data provided by the cameras’ IP addresses, combined with commercially available facial-recognition systems, could prove dangerous for individuals working in or using those businesses.

You can find out if your camera has been exposed using a site like Shodan.io, a search engine that scans for devices connected to the internet, or by trying to access your camera from a device logged in to a different network. Users should also check the documentation provided by the manufacturer, rather than just plugging in a camera right away, to minimize vulnerabilities, and should make sure to set their own password on any IoT device.

This is because many brands use default logins for their products, and these logins are easily findable online. The BitSight researchers didn’t try to hack into these kinds of cameras or brute-force any passwords, but, “if we did so, we firmly believe that the number would be higher,” said Cruz. Older camera systems running deprecated and unmaintained software are more susceptible to being hacked this way; one somewhat brighter spot is that these “digital ghost ships” seem to be decreasing in number as the oldest and least secure among them are replaced or fail completely.
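
As a starting point for the kind of self-check described above, here is a rough sketch that looks up your own public IP address against Shodan’s REST API. The API key is a placeholder you would replace with your own, and an empty result only means Shodan hasn’t indexed anything at that address, not that a camera behind it is secure.

```python
# Rough sketch of the self-check described above: look up the public IP your home
# network presents to the internet and see whether Shodan has indexed any services
# on it. Requires a Shodan API key (placeholder below).
import requests

SHODAN_API_KEY = "YOUR_API_KEY_HERE"  # placeholder

# Discover your current public IP address.
my_ip = requests.get("https://api.ipify.org", timeout=10).text.strip()

resp = requests.get(
    f"https://api.shodan.io/shodan/host/{my_ip}",
    params={"key": SHODAN_API_KEY},
    timeout=10,
)

if resp.status_code == 404:
    print(f"{my_ip}: nothing indexed by Shodan (a good sign, not a guarantee).")
else:
    resp.raise_for_status()
    host = resp.json()
    print(f"{my_ip}: open ports seen by Shodan: {host.get('ports', [])}")
    for service in host.get("data", []):
        # RTSP (554) and common web ports are where camera interfaces usually live.
        print(service.get("port"), service.get("product", "unknown service"))
```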

Unsecured cameras attract hackers and malicious actors, and the risks can go beyond the embarrassing, personal, or even individual. In March this year, the hacking group Akira successfully compromised an organization using an unsecured webcam, after a first attack attempt was effectively prevented by cybersecurity protocols. In 2024, the Ukrainian government asked citizens to turn off all broadcasting cameras, after Russian agents hacked into webcams at a condo association and a car park. They altered the direction of the cameras to point toward nearby infrastructure and used the footage in planning strikes. Ukraine blocked the operation of 10,000 internet-connected digital security cameras in order to prevent further information leaks, and a May 2025 report from the Joint Cybersecurity Advisory described continued attacks from Russian espionage units on private and municipal cameras to track materials entering Ukraine.
