On Thursday, Google DeepMind announced that AI systems called AlphaProof and AlphaGeometry 2 reportedly solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving a score equivalent to a silver medal. The tech giant claims this marks the first time an AI has reached this level of performance in the prestigious math competition—but as usual in AI, the claims aren’t as clear-cut as they seem.

Google says AlphaProof uses reinforcement learning to prove mathematical statements in the formal language called Lean. The system trains itself by generating and verifying millions of proofs, progressively tackling more difficult problems. Meanwhile, AlphaGeometry 2 is described as an upgraded version of Google’s previous geometry-solving AI model, now powered by a Gemini-based language model trained on significantly more data.
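
For readers unfamiliar with formal proof languages, here is a toy example of what a theorem and proof look like in Lean 4. This is not an AlphaProof output, just an illustration of the kind of machine-checkable statement the system generates and verifies:

    -- A trivially simple theorem stated and proved in Lean 4.
    -- Lean's kernel mechanically checks the proof; no human judgment is needed.
    theorem add_comm_example (a b : Nat) : a + b = b + a := by
      exact Nat.add_comm a b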

According to Google, prominent mathematicians Sir Timothy Gowers and Dr. Joseph Myers scored the AI model’s solutions using official IMO rules. The company reports its combined system earned 28 out of 42 possible points, just shy of the 29-point gold medal threshold. This included a perfect score on the competition’s hardest problem, which Google claims only five human contestants solved this year.

Arguably, few companies have unintentionally contributed more to the increase of AI-generated noise online than OpenAI. Despite its best intentions—and against its terms of service—its AI language models are often used to compose spam, and its pioneering research has inspired others to build AI models that can potentially do the same. This influx of AI-generated content has further reduced the effectiveness of SEO-driven search engines like Google. In 2024, web search is in a sorry state indeed.

It’s interesting, then, that OpenAI is now offering a potential solution to that problem. On Thursday, OpenAI revealed a prototype AI-powered search engine called SearchGPT that aims to provide users with quick, accurate answers sourced from the web. It’s also a direct challenge to Google, which has tried to apply generative AI to web search with little success.

The company says it plans to integrate the most useful aspects of the temporary prototype into ChatGPT in the future. ChatGPT can already perform web searches using Bing, but SearchGPT seems to be a purpose-built interface for AI-assisted web searching.

Google is redesigning Chrome malware detections to include password-protected executable files that users can upload for deep scanning, a change the browser maker says will allow it to detect more malicious threats.

Google has long allowed users to switch on the Enhanced Mode of its Safe Browsing, a Chrome feature that warns users when they’re downloading a file that’s believed to be unsafe, either because of suspicious characteristics or because it’s in a list of known malware. With Enhanced Mode turned on, Google will prompt users to upload suspicious files that aren’t allowed or blocked by its detection engine. Under the new changes, Google will prompt these users to provide any password needed to open the file.
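
Why does the scanner need the password at all? Encrypted archives simply can’t be inspected without it. Here is a minimal sketch of the idea in Python, assuming a traditionally encrypted zip file; it is an illustration, not Google’s actual scanning pipeline:

    # Illustration only: a scanner can only read (and hash) the files inside
    # an encrypted archive once it has the password the user supplies.
    import hashlib
    import zipfile

    def hash_protected_archive(path, password):
        """Return SHA-256 hashes of the files inside a password-protected zip."""
        hashes = {}
        with zipfile.ZipFile(path) as archive:
            for name in archive.namelist():
                data = archive.read(name, pwd=password.encode())
                hashes[name] = hashlib.sha256(data).hexdigest()
        return hashes

    # The resulting hashes could then be compared against known-malware lists.
    # hash_protected_archive("suspicious.zip", "infected")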

Beware of password-protected archives

The change was announced in a post published Wednesday by Jasika Bawa, Lily Chen, and Daniel Rubery of the Chrome Security team.

In 2012, an industry-wide coalition of hardware and software makers adopted Secure Boot to protect against a long-looming security threat. The threat was the specter of malware that could infect the BIOS, the firmware that loaded the operating system each time a computer booted up. From there, it could remain immune to detection and removal and could load even before the OS and security apps did.

The threat of such BIOS-dwelling malware was largely theoretical and fueled in large part by the creation of ICLord Bioskit by a Chinese researcher in 2007. ICLord was a rootkit, a class of malware that gains and maintains stealthy root access by subverting key protections built into the operating system. The proof of concept demonstrated that such BIOS rootkits weren’t only feasible; they were also powerful. In 2011, the threat became a reality with the discovery of Mebromi, the first-known BIOS rootkit to be used in the wild.

Keenly aware of Mebromi and its potential for a devastating new class of attack, the Secure Boot architects hashed out a complex new way to shore up security in the pre-boot environment. Built into UEFI—the Unified Extensible Firmware Interface that would become the successor to BIOS—Secure Boot used public-key cryptography to block the loading of any code that wasn’t signed with a pre-approved digital signature. To this day, key players in security—among them Microsoft and the US National Security Agency—regard Secure Boot as an important, if not essential, foundation of trust in securing devices in some of the most critical environments, including in industrial control and enterprise networks.
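
Conceptually, that check looks something like the sketch below, written in Python with the widely used cryptography library and assuming an RSA key. Real Secure Boot runs inside UEFI firmware against signed PE binaries and a database of platform keys, so this is only an illustration of the underlying public-key idea:

    # Conceptual sketch of a Secure Boot-style check: refuse to run any boot
    # component whose signature doesn't verify against a trusted public key.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import padding

    def is_trusted(image: bytes, signature: bytes, trusted_key_pem: bytes) -> bool:
        public_key = serialization.load_pem_public_key(trusted_key_pem)
        try:
            public_key.verify(signature, image, padding.PKCS1v15(), hashes.SHA256())
            return True
        except InvalidSignature:
            return False

    # A loader would call this before handing control to the next stage:
    # if not is_trusted(bootloader_bytes, sig_bytes, platform_key_pem):
    #     halt("unsigned or tampered boot component")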

In June, Runway debuted a new text-to-video synthesis model called Gen-3 Alpha. It converts written descriptions called “prompts” into HD video clips without sound. We’ve since had a chance to use it and wanted to share our results. Our tests show that careful prompting isn’t as important as matching concepts likely found in the training data, and that achieving amusing results likely requires many generations and selective cherry-picking.

An enduring theme of all generative AI models we’ve seen since 2022 is that they can be excellent at mixing concepts found in training data but are typically very poor at generalizing (applying learned “knowledge” to new situations the model has not explicitly been trained on). That means they can excel at stylistic and thematic novelty but struggle at fundamental structural novelty that goes beyond the training data.

What does all that mean? In the case of Runway Gen-3, lack of generalization means you might ask for a sailing ship in a swirling cup of coffee, and provided that Gen-3’s training data includes video examples of sailing ships and swirling coffee, that’s an “easy” novel combination for the model to make fairly convincingly. But if you ask for a cat drinking a can of beer (in a beer commercial), it will generally fail because there aren’t likely many videos of photorealistic cats drinking human beverages in the training data. Instead, the model will pull from what it has learned about videos of cats and videos of beer commercials and combine them. The result is a cat with human hands pounding back a brewsky.

CrowdStrike’s Falcon security software brought down as many as 8.5 million Windows PCs over the weekend. (credit: CrowdStrike)

Security firm CrowdStrike has posted a preliminary post-incident report about the botched update to its Falcon security software that caused as many as 8.5 million Windows PCs to crash over the weekend, delaying flights, disrupting emergency response systems, and generally wreaking havoc.

The detailed post explains exactly what happened: At just after midnight Eastern time, CrowdStrike deployed “a content configuration update” to allow its software to “gather telemetry on possible novel threat techniques.” CrowdStrike says that these Rapid Response Content updates are tested before being deployed, and one of the steps involves checking updates using something called the Content Validator. In this case, because of “a bug in the Content Validator,” the check failed to detect “problematic content data” in the update responsible for the crashing systems.
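
CrowdStrike hasn’t published the Content Validator’s internals, but the general pattern it describes, a validation gate that must pass before content is pushed to endpoints, looks roughly like the hedged Python sketch below; the function names and the JSON format are assumptions for illustration only:

    # Purely illustrative: a pre-deployment gate that rejects malformed content.
    # This is not CrowdStrike's Content Validator; the JSON format is assumed.
    import json

    def validate_content(raw: bytes) -> bool:
        """Reject content that is empty or not well-formed before deployment."""
        if not raw.strip():
            return False
        try:
            json.loads(raw)
        except json.JSONDecodeError:
            return False
        return True

    def deploy_rapid_response_content(raw: bytes) -> None:
        if not validate_content(raw):
            raise ValueError("content failed validation; refusing to deploy")
        # ...push to endpoints only after the gate passes...

A bug that makes such a gate wave through bad data is exactly what lets a single flawed update reach every endpoint at once.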

CrowdStrike says it is making changes to its testing and deployment processes to prevent something like this from happening again. The company is specifically including “additional validation checks to the Content Validator” and adding more layers of testing to its process.

Elon Musk, chief executive officer of Tesla Inc., during a fireside discussion on artificial intelligence risks with Rishi Sunak, UK prime minister, in London, UK, on Thursday, Nov. 2, 2023. (credit: Getty Images)

On Monday, Elon Musk announced the start of training for what he calls “the world’s most powerful AI training cluster” at xAI’s new supercomputer facility in Memphis, Tennessee. The billionaire entrepreneur and CEO of multiple tech companies took to X (formerly Twitter) to share that the so-called “Memphis Supercluster” began operations at approximately 4:20 am local time that day.

Musk’s xAI team, in collaboration with X and Nvidia, launched the supercomputer cluster featuring 100,000 liquid-cooled H100 GPUs on a single RDMA fabric. This setup, according to Musk, gives xAI “a significant advantage in training the world’s most powerful AI by every metric by December this year.”

Given issues with xAI’s Grok chatbot throughout the year, skeptics would be justified in questioning whether those claims will match reality, especially given Musk’s tendency toward grandiose, off-the-cuff remarks on the social media platform he runs.

The cityscape from the tower of the Lviv Town Hall in winter. (credit: Anastasiia Smolienko / Ukrinform/Future Publishing via Getty Images)

As Russia has tested every form of attack on Ukraine’s civilians over the past decade, both digital and physical, it’s often used winter as one of its weapons—launching cyberattacks on electric utilities to trigger December blackouts and ruthlessly bombing heating infrastructure. Now it appears Russia-based hackers last January tried yet another approach to leave Ukrainians in the cold: a specimen of malicious software that, for the first time, allowed hackers to reach directly into a Ukrainian heating utility, switching off heat and hot water to hundreds of buildings in the midst of a winter freeze.

Industrial cybersecurity firm Dragos on Tuesday revealed a newly discovered sample of Russia-linked malware that it believes was used in a cyberattack in late January to target a heating utility in Lviv, Ukraine, disabling service to 600 buildings for around 48 hours. The attack, in which the malware altered temperature readings to trick control systems into cooling the hot water running through buildings’ pipes, marks the first confirmed case in which hackers have directly sabotaged a heating utility.
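
The mechanism is easier to see with a toy control loop. The Python sketch below is not the actual malware or the utility’s control logic, just an illustration of why falsified temperature readings cause a controller to stop heating:

    # Toy example: a controller that blindly trusts reported temperatures will
    # shut the heat off if an attacker inflates the sensor value.
    TARGET_C = 60.0  # assumed hot-water setpoint, for illustration only

    def control_step(reported_temp_c):
        return "heater_off" if reported_temp_c >= TARGET_C else "heater_on"

    print(control_step(35.0))  # actual lukewarm water -> "heater_on"
    print(control_step(75.0))  # spoofed "hot" reading -> "heater_off", buildings go cold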

Dragos’ report on the malware notes that the attack occurred at a moment when Lviv was experiencing its typical January freeze, close to the coldest time of the year in the region, and that “the civilian population had to endure sub-zero [Celsius] temperatures.” As Dragos analyst Kyle O’Meara puts it more bluntly: “It’s a shitty thing for someone to turn off your heat in the middle of winter.”

In the AI world, there’s a buzz in the air about a new AI language model released Tuesday by Meta: Llama 3.1 405B. The reason? It’s potentially the first time anyone can download a GPT-4-class large language model (LLM) for free and run it on their own hardware. You’ll still need some beefy hardware: Meta says it can run on a “single server node,” which isn’t desktop PC-grade equipment. But it’s a provocative shot across the bow of “closed” AI model vendors such as OpenAI and Anthropic.
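
As a rough idea of what “running it on your own hardware” means in practice, here is a sketch using Hugging Face’s transformers library. The repository ID is an assumption, you would need to accept Meta’s license to download the weights, and the 405B model requires server-class GPU memory far beyond a desktop PC:

    # Sketch only: loading and sampling from Llama 3.1 with transformers.
    # The model ID below is an assumption; access and hardware requirements apply.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Meta-Llama-3.1-405B-Instruct"  # assumed repo name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer("What is a frontier model?", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))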

“Llama 3.1 405B is the first openly available model that rivals the top AI models when it comes to state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation,” says Meta. Company CEO Mark Zuckerberg calls 405B “the first frontier-level open source AI model.”

In the AI industry, “frontier model” is a term for an AI system designed to push the boundaries of current capabilities. In this case, Meta is positioning 405B among the likes of the industry’s top AI models, such as OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.5 Pro.

A bad update to CrowdStrike’s Falcon security software crashed millions of Windows PCs last week. (credit: CrowdStrike)

By Monday morning, many of the major disruptions from the flawed CrowdStrike security update late last week had cleared up. Flight delays and cancellations were no longer front-page news, and multiple Starbucks locations near me were taking orders through the app once again.

But the cleanup effort continues. Microsoft estimates that around 8.5 million Windows systems were affected by the issue, which involved a buggy .sys file that was automatically pushed to Windows PCs running the CrowdStrike Falcon security software. Once downloaded, that update caused Windows systems to display the dreaded Blue Screen of Death and enter a boot loop.

“While software updates may occasionally cause disturbances, significant incidents like the CrowdStrike event are infrequent,” wrote Microsoft VP of Enterprise and OS Security David Weston in a blog post. “We currently estimate that CrowdStrike’s update affected 8.5 million Windows devices, or less than one percent of all Windows machines. While the percentage was small, the broad economic and societal impacts reflect the use of CrowdStrike by enterprises that run many critical services.”
