Stay updated with the latest in technology, global innovations, and key economic trends. From AI breakthroughs to global energy market insights, we bring you the news that matters.
Google Launches Gemini 3.5 Live Translate Real-Time Audio Translation with Near-Zero Latency.
Get link
Facebook
X
Pinterest
Email
Other Apps
-
Google Debuts 'Gemini 3.5 Live Translate': A Zero-Latency, Real-Time Audio Translation Engine Supporting 70+ Languages
Google has officially announced Gemini 3.5 Live Translate, its next-generation artificial intelligence model meticulously engineered for real-time, bidirectional voice translation. Shifting away from legacy "cascade" translation architectures which force the system to wait for a speaker to completely finish a sentence before processing this new model operates on a continuous streaming pipeline, slashing operational delays to mere milliseconds.
The model’s core strength lies in its native multimodal design. Gemini 3.5 Live Translate can automatically identify and process conversational inputs across more than 70 global languages without requiring manual configuration.
Furthermore, Google has integrated advanced neural audio filtering to effectively isolate human speech from chaotic ambient noise. The resulting audio output is remarkably human-like; the AI dynamically preserves the speaker’s original vocal tone, emotional inflections, and natural speech rhythms, creating a seamless and organic conversational flow.
Google is rolling out the new translation engine across several deployment tiers starting today:
Developers: Available immediately in Public Preview via the Gemini Live API and within Google AI Studio.
Enterprise Users: Integrated directly into Google Meet as a limited-access pilot program, allowing corporate clients to test real-time translated captions and voice dubbing before a wider corporate expansion.
General Consumers: Accessible globally through the native Google Translate app under the revamped Live Translate feature.
Android-Exclusive Upgrades: Android users gain access to a specialized utility called Listening Mode. This feature streams incoming audio through the phone's microphone, processes the translation silently, and routes the output discreetly, allowing users to listen to the translation privately without needing a headset or broadcasting the audio via public speakers.
The near-zero latency achieved by Gemini 3.5 Live Translate is due to the technological transition from the previous STT -> Translation -> TTS (Transcription to Text -> Text Synthesis to Speech) model to Direct Speech-to-Speech Translation (S2ST). The model directly converts the source language's speech waves to the target language's speech waves at the neural layer, eliminating the need for prior text conversion. This results in seamless translation, much like having a personal interpreter whispering to you.
While the app is generally available, advanced features like Listening Mode on Android perform optimally on flagship devices with powerful on-device NPUs. Analysts also note the significant impact of launching such a high-speed Live Translate model. This paves the way for backend software systems to support new Google Smart Glasses or Pixel Buds, which focus on real-time language translation through eye and voice without needing to look at the phone screen.
In the Listening Mode feature for general users, Google has designed a security architecture so that audio data recorded through microphones in public places is processed and immediately destroyed in volatile memory, without saving audio files to the cloud or keeping audio logs. This ensures maximum user peace of mind regarding privacy guidelines.
Ask me anything about this article. No data is stored for your question.
Google Debuts 'Gemini 3.5 Live Translate': A Zero-Latency, Real-Time Audio Translation Engine Supporting 70+ Languages
Google has officially announced Gemini 3.5 Live Translate, its next-generation artificial intelligence model meticulously engineered for real-time, bidirectional voice translation. Shifting away from legacy "cascade" translation architectures which force the system to wait for a speaker to completely finish a sentence before processing this new model operates on a continuous streaming pipeline, slashing operational delays to mere milliseconds.
The model’s core strength lies in its native multimodal design. Gemini 3.5 Live Translate can automatically identify and process conversational inputs across more than 70 global languages without requiring manual configuration.
Furthermore, Google has integrated advanced neural audio filtering to effectively isolate human speech from chaotic ambient noise. The resulting audio output is remarkably human-like; the AI dynamically preserves the speaker’s original vocal tone, emotional inflections, and natural speech rhythms, creating a seamless and organic conversational flow.
Google is rolling out the new translation engine across several deployment tiers starting today:
Developers: Available immediately in Public Preview via the Gemini Live API and within Google AI Studio.
Enterprise Users: Integrated directly into Google Meet as a limited-access pilot program, allowing corporate clients to test real-time translated captions and voice dubbing before a wider corporate expansion.
General Consumers: Accessible globally through the native Google Translate app under the revamped Live Translate feature.
Android-Exclusive Upgrades: Android users gain access to a specialized utility called Listening Mode. This feature streams incoming audio through the phone's microphone, processes the translation silently, and routes the output discreetly, allowing users to listen to the translation privately without needing a headset or broadcasting the audio via public speakers.
The near-zero latency achieved by Gemini 3.5 Live Translate is due to the technological transition from the previous STT -> Translation -> TTS (Transcription to Text -> Text Synthesis to Speech) model to Direct Speech-to-Speech Translation (S2ST). The model directly converts the source language's speech waves to the target language's speech waves at the neural layer, eliminating the need for prior text conversion. This results in seamless translation, much like having a personal interpreter whispering to you.
While the app is generally available, advanced features like Listening Mode on Android perform optimally on flagship devices with powerful on-device NPUs. Analysts also note the significant impact of launching such a high-speed Live Translate model. This paves the way for backend software systems to support new Google Smart Glasses or Pixel Buds, which focus on real-time language translation through eye and voice without needing to look at the phone screen.
In the Listening Mode feature for general users, Google has designed a security architecture so that audio data recorded through microphones in public places is processed and immediately destroyed in volatile memory, without saving audio files to the cloud or keeping audio logs. This ensures maximum user peace of mind regarding privacy guidelines.
Alphabet Announces Massive $80 Billion Share Allocation, Anchored by a Strategic $10 Billion Investment from Warren Buffett’s Berkshire Hathaway Alphabet Inc., the parent company of Google, has officially unveiled a comprehensive capital-raising program to issue up to $80 billion in new shares. The extensive equity offering is split into three primary tranches: a $30 billion public offering comprising a mix of convertible preferred stock, Class A common stock, and Class C capital stock; and a $40 billion at-the-market (ATM) equity offering program, which the tech giant expects to initiate during the third quarter of this year. The definitive highlight of the capital restructuring is the final $10 billion tranche, which has been allocated as a targeted private placement directly to Berkshire Hathaway , the legendary investment conglomerate founded by Warren Buffett (who is no longer managing day-to-day operations). The private placement is divided equally, with Berkshire acquiring $5 b...
Apple Unveils Next-Generation Apple Intelligence at WWDC 2026, Integrating Google Gemini to Power On-Screen Awareness and Personal Context At its Worldwide Developers Conference (WWDC 2026), Apple officially previewed the next generation of Apple Intelligence . In a groundbreaking shift, Apple has re-engineered its AI architecture by blending its proprietary Apple Foundation Models with Google’s Gemini models. This hybrid network structure balances on-device processing for strict user privacy with server-side computation via Private Cloud Compute . The upgrade fundamentally focuses on deeper personal context understanding , an expansive web-based world knowledge, cross-app execution capabilities, and a highly advanced on-screen awareness . The latest Apple Intelligence upgrade brings a suite of transformative features across everyday applications: 📸 Photos: Advanced Generative Editing The native Photos app receives three major generative AI tools aimed at reimagining image compositio...
NVIDIA Forges South Korean Mega-Alliance: Partnering with SK Hynix, NAVER, LG, and Doosan to Build Gigawatt-Scale AI Infrastructure NVIDIA has officially announced a series of sweeping strategic partnerships with South Korea’s leading technology and industrial conglomerates including NAVER, SK Hynix, LG, and the Doosan Group to construct next-generation AI data centers and advance localized AI ecosystems. While the definitive financial terms for individual agreements remain undisclosed, the combined initiatives represent a massive infrastructure push in the Asia-Pacific region. The core hardware alliance is anchored by a multi-year agreement with semiconductor pioneer SK Hynix . The memory giant will co-develop next-generation high-bandwidth memory (HBM) architectures tailored specifically to power NVIDIA’s long-term hardware roadmap, spanning high-density data center servers down to consumer-facing RTX Spark PC platforms. In tandem, SK Hynix will integrate the NVIDIA Omniverse plat...
SpaceX Lands Blockbuster $30 Billion AI Compute Contract with Longtime Investor Google Just Days Before Historic Nasdaq IPO SpaceX has quietly secured a massive multi-year cloud infrastructure agreement with Alphabet's Google, locking in a staggering $920 million in monthly payments . According to a recent regulatory filing with the U.S. Securities and Exchange Commission (SEC), the contract will run from October through June 2029, representing an aggregate deal value of approximately $30 billion . The critical disclosure arrives just days before SpaceX is scheduled to make its historic debut on the Nasdaq exchange, where it seeks to shatter records with a $1.75 trillion valuation and a target raise of $75 billion. Under the mechanics of the agreement, Google Cloud will gain high-density access to an immense pool of compute architecture, leveraging approximately 110,000 Nvidia graphics processing units (GPUs) alongside paired CPUs and custom networking equipment housed within Sp...
Cloudflare Acquires VoidZero to Transform Vite into a Full-Stack Powerhouse Challenging Next.js Cloudflare has officially announced the acquisition of VoidZero, the startup behind some of the web development community's most popular open-source tooling, including Vite, Vitest, Rolldown, and Oxc. Following the acquisition, Cloudflare confirmed that all of VoidZero’s core projects will remain entirely open-source and framework-agnostic, ensuring developers can continue using them across any hosting environment. The partnership stems from Cloudflare deep existing reliance on the VoidZero ecosystem. Cloudflare already utilizes Vite to power its global Cloudflare Dashboard and leverages Oxc’s high-performance linter, Oxlint, for its internal codebase quality assurance. While the tools will remain independent of any single platform, Cloudflare plans to release deep, native integrations via its command-line interface. Developers will soon use the cf tool directly alongside Vite; for ins...
S&P 500 Rebalance: Marvell Technology and Flex Set to Join Benchmarks as Tech Displaces Legacy Staples In its latest quarterly rebalancing, S&P Dow Jones Indices (a division of S&P Global) announced that semiconductor innovator Marvell Technology Inc. (NASDAQ: MRVL) and electronics manufacturing services giant Flex Ltd. (NASDAQ: FLEX) will be added to the benchmark S&P 500 index. The changes will take effect prior to the opening bell on Monday, June 22, 2026 . To make room for these technology powerhouses, automated pool equipment distributor Pool Corp. and packaged food legacy giant The Campbell's Company (formerly Campbell Soup Co.) will be removed from the S&P 500 and repositioned into the S&P SmallCap 600. The inclusion of Marvell Technology underscores the relentless, sweeping momentum of the artificial intelligence infrastructure boom on global capital markets. Marvell, a leading developer of specialized data-infrastructure architecture, custom ap...
OpenAI Files Confidential Draft S-1 with SEC, Marking Historic First Step Toward Highly Anticipated IPO In a milestone move for the artificial intelligence sector, OpenAI announced that it has officially submitted a confidential draft registration statement on Form S-1 to the U.S. Securities and Exchange Commission (SEC). The filing represents the critical, mandatory first step required for any private enterprise looking to launch an Initial Public Offering (IPO) on Wall Street. By opting for a confidential submission , OpenAI is legally permitted to keep its detailed financial balance sheets, proprietary metrics, and internal operational performance metrics shielded from public view during the initial regulatory review cycles. Acknowledging the realities of the modern tech landscape, OpenAI humorously noted that it chose to pre-emptively announce the filing themselves, knowing the milestone would inevitably leak to the press anyway. Despite taking this foundational step, OpenAI emph...
Comments
Post a Comment