Weekly AI News

Your go-to digest for groundbreaking AI trends and innovations

Hey, it’s Jul. Welcome to the sixth edition of “Weekly AI News”!

The weeks go by, and the pattern continues. Another wave of AI breakthroughs has arrived, with 47 pieces of news covered—our new record.

On one side, OpenAI and Anthropic push their models further: GPT-4.5 “Orion” focuses on emotional intelligence (but remains costly), while Claude 3.7 Sonnet introduces “hybrid reasoning” for faster and deeper responses. Yet, both struggle—OpenAI lacks GPUs, and Anthropic faces rising compute costs.

Meanwhile, Amazon, Microsoft, and Apple intensify their AI race: Alexa+ gains advanced features, Copilot removes usage limits, and Apple invests $500B in AI infrastructure—all vying to dominate the future of services.

Asian competitors are hardly on the sidelines. Alibaba and Tencent unveil QwQ-Max-Preview and Hunyuan Turbo S, DeepSeek expands its open-source efforts, and Alibaba even open-sources video models like Wan2.1.

Yet, AI’s societal impact sparks debate. Elon Musk censors Grok 3, contradicting his “free speech” stance. 80% of U.S. employees doubt AI’s workplace value, and in the UK, Elton John and Paul McCartney fight for stronger copyright protections.

Two AI ecosystems are emerging: one commercial and competitive, the other collaborative and open-source. The industry’s challenge? Balancing innovation, ethics, and global governance—because AI isn’t just reshaping tech, it’s redefining how we work, create, and interact with the world.

Happy reading!

🤖✨ Anthropic unveils Claude 3.7 Sonnet, the first-ever “hybrid reasoning model”

Anthropic has introduced Claude 3.7 Sonnet, the first AI that combines instant responses with controllable extended thinking. Users can switch between a fast mode and an “extended thinking” mode that shows the model’s chain of thought. The API lets you fine-tune how long Claude thinks—up to 128K tokens—to balance speed, cost, and quality per task. Claude 3.7 also boasts state-of-the-art coding benchmarks, surpassing rivals like o1 and o3-mini.
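To make the “fine-tune how long Claude thinks” idea concrete, here is a minimal sketch of how a request payload might be assembled. The `thinking` block and its `budget_tokens` field follow Anthropic’s published extended-thinking API; the model string and the specific budgets are illustrative assumptions, not values from this article.

```python
from typing import Optional

def build_request(prompt: str, thinking_budget: Optional[int] = None) -> dict:
    """Build a Messages API payload, optionally enabling extended thinking.

    thinking_budget caps how many tokens Claude may spend reasoning
    (up to 128K per the announcement); None keeps the fast default mode.
    """
    payload = {
        "model": "claude-3-7-sonnet-20250219",  # illustrative model string
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking_budget is not None:
        # Extended thinking: the model produces a visible chain of thought
        # before its final answer, bounded by budget_tokens.
        payload["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
        # max_tokens must exceed the thinking budget so the answer fits too.
        payload["max_tokens"] = thinking_budget + 4096
    return payload

fast = build_request("Summarise this diff.")
deep = build_request("Prove this invariant holds.", thinking_budget=32_000)
```

The point of the per-request budget is exactly the trade-off the announcement describes: cheap, instant answers for routine prompts, and a large reasoning budget only for the tasks that warrant it.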

🤖🧩 Claude Code, Anthropic’s new coding assistant

Anthropic unveiled Claude Code, a command-line coding agent capable of editing files, reading code, and running tests. It’s designed to compete with Anysphere and Replit—both already using Anthropic’s tech. Access remains limited, but Anthropic has beaten OpenAI to a standalone coding tool. Coding comprises 15–25% of Claude’s usage, signaling a key focus for the company.

🤖🚀 Anthropic’s three-stage roadmap for Claude

Anthropic outlined its vision in three phases: In 2024, Claude “assists” people in improving their current tasks; in 2025, Claude “collaborates” for hours of independent work at expert levels; and by 2027, Claude “pioneers” breakthrough solutions that take human teams years to achieve. This bold plan aims to drastically amplify Claude’s intelligence and creative capacity.

🤖🎮 Claude plays Pokémon Red live

Anthropic showcased Claude 3.7 Sonnet tackling Pokémon Red on Twitch, defeating three Gym Leaders—while the older version struggled to leave the starting town. The livestream displays Claude’s “thought process” alongside the real-time game. Equipped with a knowledge base, function calling, and vision, 3.7 Sonnet navigates more effectively, illustrating future potential for AI in gaming and entertainment.

🤖💬 Amazon unveils Alexa+

The new Alexa+ is more conversational, powered by Amazon’s Nova and Anthropic’s Claude models. Prime members get it free, while non-subscribers pay $19.99/month. It handles tasks like ordering groceries, booking reservations, and purchasing tickets, and can even analyse documents. Rolling out first to Echo Show devices in March, Alexa+ will expand to nearly all Alexa-enabled platforms.

🤖💖 OpenAI releases GPT-4.5 with emotional intelligence

Nicknamed “Orion,” GPT-4.5 focuses on richer conversational flow and deeper emotional understanding. It’s less prone to hallucinations and more accurate overall, though it shows limited gains in math or science. Available first to Pro and paid accounts with a steep API price, GPT-4.5 may be the last purely “non-reasoning” upgrade before OpenAI moves to a hybrid reasoner.

🤖🌏 Operator AI expands to more countries

OpenAI’s recently released Operator AI agent is now available for Pro users in Australia, Brazil, Canada, India, Japan, Singapore, South Korea, the UK, and other regions that support ChatGPT. Due to regulatory and GPU constraints, availability in the EU and a few European countries remains pending. However, Pro users in other parts of the world can start testing Operator now.

🤖🔍 OpenAI’s “deep research” rolls out to paid subscribers

OpenAI is extending its advanced web browsing agent, deep research, to all ChatGPT paying users (Plus, Team, Enterprise, Edu), allowing 10 monthly queries. Previously exclusive to ChatGPT Pro users at $200/month, the agent generates thorough reports from web data. Pro users now get 120 monthly queries, up from 100.

🤖⛔ OpenAI CEO Altman confirms a GPU shortage

Sam Altman reveals that GPT-4.5’s release is staggered because OpenAI is “out of GPUs.” The enormous model requires tens of thousands more GPUs, so initial access is limited to Pro subscribers, with Plus users next week. GPT-4.5’s API cost is extremely high at $75 per million input tokens and $150 per million output tokens, reflecting the strain on OpenAI’s hardware resources.
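To see how steep those rates are in practice, here is a quick cost helper using the prices reported above ($75 per million input tokens, $150 per million output tokens); the helper itself is illustrative.

```python
INPUT_PER_M = 75.00    # USD per 1M input tokens (reported GPT-4.5 rate)
OUTPUT_PER_M = 150.00  # USD per 1M output tokens (reported GPT-4.5 rate)

def gpt45_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one GPT-4.5 call at the reported rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# A modest 10K-in / 1K-out call already costs about 90 cents:
print(round(gpt45_cost(10_000, 1_000), 2))  # → 0.9
```

At those rates, a long-context workload of a few million tokens a day runs into hundreds of dollars, which goes a long way toward explaining the Pro-first rollout.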

🤖🌐 OpenAI targets superintelligence by 2027

Chief Product Officer Kevin Weil predicts that artificial superintelligence could arrive within four years. OpenAI’s plan calls for massive investments: $13 billion in Microsoft servers by 2025, $28 billion by 2028, and up to $325 billion in compute through 2030. Profits are expected only after launching the next-gen “Stargate” system in 2030.

🤖🏭 Microsoft readies for an AI-powered future

CEO Satya Nadella believes no single firm will monopolise AI, anticipating open-source projects to prevent a winner-takes-all scenario. Microsoft is investing heavily in building and leasing data centers, likening it to the dotcom-era fiber overbuild. While OpenAI chases superintelligence, Microsoft plans to be the key infrastructure provider for the entire AI sector.

🤖🎙 Microsoft removes Copilot usage limits

Microsoft has eliminated usage caps on Copilot’s voice feature and the embedded o1 “Think Deeper” reasoning model. Previously, free users faced restrictions, but now they can leverage unlimited voice interactions and in-depth reasoning. This move aims to broaden Copilot’s user base and encourage more extensive AI-driven conversations.

🤖💡 Microsoft debuts Phi-4 and Phi-4 mini

Microsoft’s new multimodal Phi-4 models match or exceed rivals twice their size on certain tasks. Phi-4 processes text, images, and more, while Phi-4 mini offers a compact version with comparable capabilities. These models continue Microsoft’s push to optimise performance across a variety of resource-constrained environments.

🤖🍏 Copilot arrives on macOS

Microsoft’s Copilot AI app is now available on the Mac App Store. It requires macOS 14.0 and an Apple Silicon chip. Previously, Copilot was official only on iPhone and iPad, and Mac users had to run the iPad build as a workaround. Now a native build lets them use Copilot without relying on mobile emulation.

🤖🔥 Grok 3 sparks controversy at xAI

xAI’s Grok 3 model initially produced critical statements about Donald Trump and Elon Musk despite Musk’s promise of a “maximally truth-seeking” AI. xAI engineers quickly patched the system to avoid those topics, raising concerns about censorship and contradictory stances on free expression. The incident highlights challenges in balancing open dialogue with brand or personal sensitivities.

🤖🤖 Grok 3 leaks its system prompt

Users discovered that asking Grok 3 to reveal its reasoning triggered instructions to avoid linking Trump or Musk to misinformation. This glitch underscores the tension between transparent AI “thought processes” and behind-the-scenes moderation. Reddit tests confirm that Grok 3 inadvertently exposes xAI’s internal censorship approach.

🤖📱 Meta plans a standalone Meta AI app

Meta is rumoured to be developing a dedicated app for its Meta AI assistant, which would compete directly with ChatGPT, Anthropic’s Claude, and Google’s Gemini. The app might include a premium subscription tier. CEO Mark Zuckerberg aims for Meta AI to become “the leading AI assistant,” expanding usage beyond the confines of existing social media apps.

🤖🌍 Meta AI in Arabic

Meta AI has launched across 10 Middle Eastern and North African countries, offering text generation, image creation, and animations in Arabic. This rollout underscores Meta’s drive for a global audience and highlights the rapid growth of AI capabilities in non-English markets.

🤖👓 Meta’s experimental Aria Gen 2 smart glasses

Meta unveiled its latest Aria Gen 2 research glasses, designed to advance AI, robotics, and machine perception studies. Researchers can capture real-world data for next-gen AR and AI development. Meta envisions smart glasses as a future computing paradigm, aiming to expand beyond smartphones as the default platform.

🤖🏗 Meta considers a $200B AI data center project

Meta is discussing plans to build a massive AI-focused data center campus that would dwarf any prior infrastructure project. Unofficial estimates place costs above $200 billion, far exceeding the scale of Meta’s other ongoing data center expansions. The move underscores Meta’s colossal ambitions to power advanced AI initiatives.

🤖🎞 Hugging Face unveils SmolVLM2

SmolVLM2 is a tiny yet powerful family of models that can understand and analyse videos on basic devices like phones or laptops. With as few as 256M parameters, it delivers performance on par with significantly larger systems. Demonstrations include local video analysis on iPhones, paving the way for on-device AI without cloud reliance.

🤖📈 NVIDIA’s Jensen Huang clarifies DeepSeek’s impact

NVIDIA’s CEO stresses that DeepSeek hasn’t obviated the need for NVIDIA GPUs. While DeepSeek focuses on certain phases of AI workflows, post-training and other critical steps still depend on GPU infrastructure. Investors concerned about reduced GPU demand can rest assured that industry-wide AI adoption remains strong.

🤖🖥 Google’s free Gemini Code Assist

Google launched a free tier of Gemini Code Assist for individual developers with generous usage limits—up to 180,000 code completions per month. It is powered by a fine-tuned Gemini 2.0 model, featuring a 128,000-token context window. Integrated with VS Code, GitHub, and JetBrains, it challenges incumbents like GitHub Copilot with a more accessible option.

🤖🔎 Google Veo 2 lands on Freepik

Freepik just added Google’s Veo 2 to its arsenal of AI models, alongside MiniMax, Runway Gen 3, and more. For now, Veo 2 on Freepik is limited to 5-second text-to-video clips at 720p, with 4K on the way. Each generation costs 1,000 credits, 8x the price of Kling standard, making it a premium feature on the platform.

🤖💰 Veo 2 cheaper on Fal AI

Despite Google’s official rate of $0.50 per second, Fal AI charges $1.25 for 5 seconds plus $0.25 per extra second, half the direct Google pricing. This raises eyebrows about Google’s inconsistent pricing across channels. Meanwhile, Vertex AI still charges the full $0.50/second for Veo 2.
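The two price schedules reported above can be compared directly: Vertex AI’s flat $0.50 per second versus Fal AI’s $1.25 base for the first 5 seconds plus $0.25 per extra second. The figures come from this article; the helper functions are illustrative.

```python
def vertex_cost(seconds: float) -> float:
    """Veo 2 via Vertex AI: flat $0.50 per second of generated video."""
    return 0.50 * seconds

def fal_cost(seconds: float) -> float:
    """Veo 2 via Fal AI: $1.25 covers the first 5 seconds,
    then $0.25 per additional second."""
    extra = max(0.0, seconds - 5)
    return 1.25 + 0.25 * extra

for s in (5, 8, 10):
    print(s, vertex_cost(s), fal_cost(s))
```

Run the loop and the pattern is clear: at every clip length, the Fal rate works out to exactly half the Vertex rate, which is what makes the pricing gap so striking.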

🤖🏢 Sergey Brin urges Google staff back to office

Google cofounder Sergey Brin informally encouraged AI-focused employees to work onsite “at least every weekday” and put in 60-hour weeks. Though it’s not replacing the official three-day in-office policy, it signifies urgency from Google’s top brass in the race for AGI against startups like OpenAI and Anthropic.

🤖💬 ElevenLabs introduces Scribe

Scribe is a speech-to-text model that claims higher accuracy than Google’s Gemini 2.0 Flash and OpenAI’s Whisper v3. It supports 99 languages and can label multiple speakers, provide word-level timestamps, and detect non-speech audio cues like laughter. Recorded audio is priced at $0.40/hour, and a low-latency version for real-time transcription is forthcoming.

🤖⚡ Ideogram 2a cuts image generation time

Ideogram’s new 2a model generates images in around 10 seconds, with “2a Turbo” roughly twice as fast. It excels at graphic design, text rendering, and photorealism, at 50% lower cost than its predecessor. Available via web, API, or third-party apps, it’s a major step forward in rapid creative content generation.

🤖📹 Pika Labs 2.2 boosts video quality

The latest Pika Labs model offers 1080p output, extends clip length to 10 seconds, and introduces transitions and transformations. This upgrade aims for more immersive, smoother content. The enhancements push Pika Labs closer to a professional-grade video generator in the fast-evolving AI video landscape.

🤖🧮 Alibaba’s QwQ-Max-Preview for advanced reasoning

Alibaba’s Qwen team rolled out QwQ-Max-Preview, based on Qwen2.5-Max but fine-tuned for complex math, coding, and “agentic” tasks. Its “Thinking (QwQ)” option in Qwen Chat reveals step-by-step reasoning. Alibaba plans to open-source QwQ-Max under an Apache 2.0 license, including a 32B variant for local deployment, challenging premium proprietary reasoning models.

🤖📂 DeepSeek to open-source five new repositories

On the heels of its successful R1 reasoning model with 22M daily users, DeepSeek announces five more open-source projects. This move cements its community-driven approach and encourages global collaboration. It also highlights DeepSeek’s commitment to democratising state-of-the-art AI developments.

🤖⚡ Tencent launches Hunyuan Turbo S

Hunyuan Turbo S favours speed over deep reflection while achieving parity with top-tier models like DeepSeek V3 and GPT-4o. Priced lower than its predecessors, Turbo S suits tasks needing quick outputs. Tencent also teased T1, a “deep thinking” model for heavy reasoning, indicating a two-pronged strategy in AI deployment.

🤖📽 Alibaba’s Wan2.1 goes open source

Alibaba’s Tongyi Lab released Wan2.1, a suite of cutting-edge video generation models outperforming Sora on VBench while being 2.5x faster. They include text-to-video, image-to-video, and video-to-audio, plus the ability to render text in English or Chinese. A lightweight 1.3B variant can generate a five-second 480p clip on an RTX 4090 in about four minutes.

🤖🕺 Animate Anyone 2 for full-body swaps

Alibaba’s Animate Anyone 2 enables users to replace a person in a video using just one reference photo. Unlike 3D-driven approaches like Wonder Dynamics, it’s purely 2D-based, reminiscent of Viggle in early demos. The technique yields realistic transformations, though technical details and code remain under wraps.

🤖💰 Alibaba to invest $53B in cloud and AI

Alibaba plans to spend 380 billion yuan over three years—exceeding its previous decade’s total—on cloud computing and AI. It aims to capitalise on China’s accelerating demand for AI, especially in its cloud business. CEO Eddie Wu says AI-driven services are Alibaba’s clearest path to new revenue streams amid flattening e-commerce growth.

🤖⚙ Replit releases Agent V2

Agent V2 promises more autonomy in building apps from a single prompt, producing slicker UI designs. It relies on Claude 3.7 Sonnet to generate better-structured code. Meanwhile, Replit Assistant and other “build with AI” platforms (Bolt, Create, Windsurf) have also adopted Claude 3.7 Sonnet, reflecting its growing popularity in coding workflows.

🤖🍏 Apple invests $500B in AI

Apple plans to create 20,000 U.S. R&D jobs and expand data centers in multiple states—plus build a 250,000 sq ft Houston facility for AI servers with Foxconn. This investment underpins “Apple Intelligence,” fueling future AI-driven products. Apple is betting big on strengthening its AI infrastructure for the long haul.

🤖❓ Apple’s TTS confuses “racist” with “Trump”

iPhone Dictation sometimes transcribed “racist” as “Trump,” prompting Apple to attribute it to difficulty processing words with “r.” Experts, however, question that explanation, suggesting code-level tweaks instead. Apple promises a swift fix, illustrating the unpredictable nature of speech recognition errors.

🤖🚀 Inception Labs debuts Mercury

Mercury is a “diffusion LLM” that can generate text up to 10x faster than standard LLMs—reaching 1,000 tokens/sec on typical GPUs. Mercury Coder matches or surpasses models like GPT-4o Mini or Claude 3.5 Haiku in code benchmarks at higher speeds. Founded by Stanford’s Stefano Ermon, Inception claims diffusion-based language generation offers major efficiency gains.

🤖🤝 Snowflake partners with Microsoft for OpenAI

Snowflake expands its Microsoft collaboration so customers can access OpenAI through Azure without leaving Snowflake’s environment. The deal mirrors Snowflake’s existing arrangement with Anthropic and illustrates how enterprise software providers bridge corporate data with leading AI models. Customers benefit from a single Snowflake bill and stronger data protections.

🤖📲 Poe introduces user-made AI apps

Poe’s “Poe Apps” feature lets users create visual interfaces on top of various AI models, from GPT-4o to Google’s Veo 2. A new App Creator powered by Claude 3.7 Sonnet converts descriptions into JavaScript code. Poe Apps can be shared publicly, with usage tied to Poe’s point system, bridging the gap between casual exploration and real AI development.

🤖🎼 Hume AI unveils Octave, emotion-aware TTS

Octave generates speech that reflects emotional context, allowing users to prompt custom voices and specify tones like whisper, sarcasm, or joy. It differs from standard TTS by interpreting meaning to modulate delivery. Hume AI provides a Creator Studio to produce long-form voice content with advanced emotional nuances.

🤖🔊 Gibber Link: sound-based AI communication

Two developers have introduced “Gibber Link,” enabling AI agents to detect each other on calls and switch from human speech to direct data-over-sound transmissions. Using “ggwave,” it reduces compute costs by 90% and shortens communication times by 80%. While slow compared to modern data rates, it’s a glimpse of new, more efficient AI-to-AI channels.

🤖🏠 1X launches NEO Gamma humanoid for the home

Norway’s 1X robotics reveals NEO Gamma, a home-focused humanoid robot that can walk, squat, sit, and handle tasks like cleaning or serving. It features “Emotive Ear Rings” and a soft, knitted nylon exterior for safety. Its quieter operation and friendly design aim to blend seamlessly into everyday home environments.

🤖🇺🇸 Americans doubt workplace AI benefits

A Pew Research Center survey finds that 80% of U.S. workers don’t use AI on the job and remain skeptical of its value. Under one-third feel excited about AI’s future in the workplace, and only 6% believe AI will create more opportunities for them. The data reflects a stark gap between Silicon Valley’s enthusiasm and public sentiment.

🤖🇬🇧 Elton John opposes UK “opt-out” AI copyright plan

Music legend Elton John wants mandatory permission for AI firms using artists’ works, rather than an opt-out system. He fears that automatically allowing copyrighted material for AI training undermines creators’ rights. Other iconic British musicians, including Paul McCartney, have also voiced concerns over potential loss of control.

🤖🔇 Silent album protests UK’s AI copyright changes

Over 1,000 British musicians—including Kate Bush, Annie Lennox, and Damon Albarn—released “Is This What We Want?”, an album of near-silence protesting proposed “opt-out” AI training rules. They argue the changes threaten creative ownership, symbolised by the empty studios on the record. Elton John and Paul McCartney likewise oppose the plan.

Community

Join AI Whisperer Community!

Ready to take your AI journey to the next level? Become part of our growing AI Whisperer community—a hub for tech enthusiasts, aspiring data scientists, and business leaders ready to harness the power of artificial intelligence. Inside, you’ll find:

  • Community: Like-minded individuals keen to grow together.

  • Curated AI Tools: Discover the latest and greatest tools to boost productivity.

  • Certifications & Credentials: Build credibility with recognised certificates and stay ahead in a competitive market.

Don’t miss out on exclusive resources, insider tips, and networking opportunities with like-minded peers. Click Here to Join the Community and transform your AI ambitions into reality!

That's it for this week!
Until next time, stay curious and keep exploring the ever-evolving world of AI!
Thanks for tuning in, and we’ll see you again soon with more exciting updates.

Jul