Explore the latest engineering posts and summaries

Posts indexed: 431

Last indexed: Mar 14, 2026

Microsoft Azure Blog | Jan 26, 2026

Maia 200: The AI accelerator built for inference

Why it matters: Maia 200 represents a shift toward custom first-party silicon optimized for LLM inference. It offers engineers high-performance FP4/FP8 compute and a flexible software stack, significantly reducing the cost and latency of deploying massive models like GPT-5.2 at scale.

Maia 200 is built on a TSMC 3nm process, featuring 140 billion transistors and delivering 10 petaFLOPS of FP4 and 5 petaFLOPS of FP8 performance.
The memory architecture utilizes 216GB of HBM3e at 7 TB/s alongside 272MB of on-chip SRAM to maximize token generation throughput.
It employs a custom Ethernet-based scale-up network providing 2.8 TB/s of bidirectional bandwidth for clusters of up to 6,144 accelerators.
The software ecosystem includes the Maia SDK with a Triton compiler, PyTorch integration, and a low-level programming language (NPL).
Engineered for efficiency, it achieves 30% better performance per dollar than existing hardware for models like GPT-5.2 and synthetic data generation.
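
To see why the memory figures above matter for token generation, here is a rough back-of-the-envelope sketch. It assumes (this is a simplification, not something the post states) that decoding one token requires streaming all model weights from HBM once, so memory bandwidth sets a floor on per-token latency. The function name and the 100 GB model size are hypothetical and chosen only for illustration.

```python
# Figures quoted in the summary above.
HBM_CAPACITY_GB = 216.0    # HBM3e capacity
HBM_BANDWIDTH_TBPS = 7.0   # HBM3e bandwidth in TB/s


def min_seconds_per_token(weights_gb: float,
                          bandwidth_tbps: float = HBM_BANDWIDTH_TBPS) -> float:
    """Bandwidth-bound lower limit on decode time per token.

    Assumption: every weight byte is read from HBM exactly once per token
    (batch size 1, no reuse from on-chip SRAM). Real systems differ.
    """
    if weights_gb <= 0 or bandwidth_tbps <= 0:
        raise ValueError("sizes and bandwidth must be positive")
    # 1 TB/s = 1000 GB/s (decimal units, matching vendor spec sheets).
    return weights_gb / (bandwidth_tbps * 1000.0)


if __name__ == "__main__":
    # Hypothetical 100 GB of weights, plus the case where HBM is full.
    for size_gb in (100.0, HBM_CAPACITY_GB):
        t = min_seconds_per_token(size_gb)
        print(f"{size_gb:.0f} GB -> {t * 1000:.1f} ms/token, "
              f"at most {1 / t:.1f} tokens/s per sequence")
```

Under this assumption, a model that fills the entire 216 GB of HBM cannot decode a single sequence faster than roughly 31 ms per token. Batching and on-chip SRAM reuse change the picture considerably, so treat the result as an order-of-magnitude bound only.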

#mlp #dist #finops

Cloudflare Blog | Jan 26, 2026

Cable cuts, storms, and DNS: a look at Internet disruptions in Q4 2025

Why it matters: Understanding global connectivity disruptions helps engineers build more resilient, multi-homed architectures. It highlights the fragility of physical infrastructure like submarine cables and the impact of BGP routing and government policy on service availability.

Q4 2025 saw over 180 global Internet disruptions caused by government mandates, physical infrastructure damage, and technical failures.
Tanzania implemented a near-total Internet shutdown during its presidential election, resulting in a 90% traffic drop and fluctuations in BGP address space announcements.
Submarine cable cuts, specifically to the PEACE and WACS systems, significantly impacted connectivity in Pakistan and Cameroon.
Infrastructure vulnerabilities in Haiti led to multiple outages for Digicel users due to international fiber optic cuts.
Beyond physical damage, disruptions were linked to hyperscaler cloud platform issues and ongoing military conflicts affecting regional network stability.

#sre #dist #culture

Cloudflare Blog | Jan 23, 2026

Route leak incident on January 22, 2026

Why it matters: This incident highlights how minor automation errors in BGP policy configuration can cause global traffic disruptions. It underscores the risks of permissive routing filters and the importance of robust validation in network automation to prevent large-scale route leaks.

An automated routing policy change intended to remove IPv6 prefix advertisements for a Bogotá data center caused a major BGP route leak in Miami.
The removal of specific prefix lists from policy statements resulted in overly permissive terms, unintentionally redistributing peer routes to other providers.
The incident lasted 25 minutes, causing significant congestion on Miami backbone infrastructure and affecting both Cloudflare customers and external parties.
The leak was classified as a mixture of Type 3 and Type 4 route leaks according to RFC 7908, violating standard valley-free routing principles.
Impact was limited to IPv6 traffic and was mitigated by manually reverting the configuration and pausing the automation platform.
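
The valley-free rule and the RFC 7908 leak types mentioned above can be sketched as a small export check. This is a simplified model and not Cloudflare's actual policy logic: it assumes every BGP neighbor is labeled as a customer, peer, or provider, and it ignores locally originated routes. The `Rel`, `export_allowed`, and `leak_type` names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Rel(Enum):
    """Business relationship of a BGP neighbor, from the local AS's view."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def export_allowed(learned_from: Rel, export_to: Rel) -> bool:
    """Valley-free export rule.

    Routes learned from customers may be exported to anyone; routes learned
    from peers or providers may only be exported to customers.
    """
    return learned_from is Rel.CUSTOMER or export_to is Rel.CUSTOMER


# Relationship pairs covered by RFC 7908 Types 1-4. Types 5 and 6 concern
# re-origination and internal/more-specific prefixes and are not modeled here.
_LEAK_TYPES = {
    (Rel.PROVIDER, Rel.PROVIDER): "Type 1",  # hairpin turn with full prefix
    (Rel.PEER, Rel.PEER): "Type 2",          # lateral ISP-ISP-ISP leak
    (Rel.PROVIDER, Rel.PEER): "Type 3",      # provider prefixes leaked to a peer
    (Rel.PEER, Rel.PROVIDER): "Type 4",      # peer prefixes leaked to a provider
}


def leak_type(learned_from: Rel, export_to: Rel) -> Optional[str]:
    """Return the RFC 7908 leak type for a proposed export, or None if allowed."""
    if export_allowed(learned_from, export_to):
        return None
    return _LEAK_TYPES[(learned_from, export_to)]


if __name__ == "__main__":
    for src in Rel:
        for dst in Rel:
            verdict = leak_type(src, dst) or "ok"
            print(f"learned from {src.value:8} -> export to {dst.value:8}: {verdict}")
```

A pre-deployment validation step in a network automation pipeline could run a check of this kind against each generated policy term and reject any change that would permit a non-valley-free export.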

#sre #dist

Salesforce Engineering | Jan 22, 2026

How Agentforce, Data, and Apps Turned the Salesforce Stack into Agentforce 360

Why it matters: This article details the architectural shift from fragmented point solutions to a unified AI stack. It provides a blueprint for solving data consistency and metadata scaling challenges, essential for engineers building reliable, real-time agentic systems at enterprise scale.

Salesforce unified its data, agent, and application layers into the Agentforce 360 stack to ensure consistent context and reasoning across all surfaces.
The platform uses Data 360 as a universal semantic model, harmonizing signals from streaming, batch, and zero-copy sources into a single pane of glass.
Engineers addressed metadata scaling by treating metadata as data, enabling efficient indexing and retrieval for massive entity volumes.
A harmonization metamodel defines mappings and transformations to generate canonical customer profiles from heterogeneous data sources.
The architecture centralizes freshness and ingest control to maintain identical answers across different AI agents and applications.
Real-time event correlation is optimized to update unified context immediately while balancing storage costs for large-scale personalization.

#data #mlp #dist

Microsoft Azure Blog | Jan 22, 2026

Beyond boundaries: The future of Azure Storage in 2026

Why it matters: Azure Storage is shifting from passive storage to an active, AI-optimized platform. Engineers must understand these scale and performance improvements to architect systems capable of handling the high-concurrency, high-throughput demands of autonomous agents and LLM lifecycles.

Azure Storage is evolving into a unified platform supporting the full AI lifecycle, from frontier model training to large-scale inferencing and agentic applications.
Blob scaled accounts now support millions of objects across hundreds of scale units, enabling massive datasets for training and tuning.
Azure Managed Lustre (AMLFS) has expanded to support 25 PiB namespaces and 512 GBps throughput to maximize GPU utilization in high-performance computing.
Deep integration with frameworks like Microsoft Foundry, Ray, and LangChain facilitates seamless data grounding and low-latency context serving for RAG architectures.
Elastic SAN and Azure Container Storage (ACStor) are being optimized for 'agentic scale' to handle the high concurrency and query volume of autonomous agents.
New storage tiers and performance updates, such as Premium SSD v2 and Cold/Archive tiers for Azure Files, focus on reducing TCO for mission-critical workloads.

#data #mlp #dist

GitHub Engineering | Jan 22, 2026

Build an agent into any app with the GitHub Copilot SDK

Why it matters: Building agentic workflows is difficult due to the complexity of context management and tool orchestration. This SDK abstracts those infrastructure hurdles, allowing engineers to focus on product logic while leveraging a production-tested agentic loop.

GitHub released the Copilot SDK in technical preview, enabling developers to embed the Copilot agentic core into custom applications.
The SDK provides programmatic access to the same execution loop used by Copilot CLI, including planning, tool orchestration, and multi-turn context management.
It supports major programming environments including Node.js, Python, Go, and .NET, with built-in support for GitHub authentication.
Key features include Model Context Protocol (MCP) server integration, custom tool definitions, and real-time streaming capabilities.
Developers can leverage existing Copilot subscriptions or provide their own API keys to power the agentic workflows.

#mlp #frontend

Spotify Engineering | Jan 22, 2026

Congratulations to the recipients of the 2025 Spotify FOSS Fund

Why it matters: Supporting open-source sustainability is crucial for the reliability of modern software stacks. This initiative demonstrates how large engineering organizations can mitigate supply chain risks and ensure the longevity of critical dependencies.

Spotify has announced the 2025 recipients of its Free and Open Source Software (FOSS) Fund.
The fund was established in 2022 to provide financial support to critical open source projects that Spotify relies on.
The initiative aims to ensure the long-term sustainability and health of the global open source ecosystem.
This program highlights the importance of corporate responsibility in maintaining the software infrastructure used by millions.

#culture #sre

GitHub Engineering | Jan 21, 2026

A cheat sheet to slash commands in GitHub Copilot CLI

Why it matters: Slash commands transform the Copilot CLI from a chat interface into a precise developer tool. By providing predictable, keyboard-driven shortcuts for context management and model selection, they minimize context switching and improve the reliability of AI-assisted terminal workflows.

Slash commands provide explicit, repeatable instructions in the GitHub Copilot CLI, reducing the need for complex natural language prompting.
Commands like /clear and /cwd allow developers to manage conversation history and directory scoping to prevent context bleed.
The /model command enables switching between different AI models to optimize for speed or reasoning depth based on the task.
Security and compliance are enhanced through commands like /add-dir and /list-dirs, which define clear boundaries for file access.
Advanced features include /mcp for connecting Model Context Protocol servers and /delegate for offloading tasks to specialized agents.
The CLI supports session management and usage tracking via /session and /usage commands to monitor resource consumption.

#mlp #security #culture

Salesforce Engineering | Jan 21, 2026

How Agentforce Runs Secure AI Agents at 11 Million Calls Per Day

Why it matters: Securing AI agents at scale requires balancing rapid innovation with enterprise-grade protection. This architecture demonstrates how to manage 11M+ daily calls by decoupling security layers, ensuring multi-tenant reliability, and maintaining request integrity across distributed systems.

Salesforce's Developer Access team manages a secure access plane for Agentforce, handling over 11 million daily agent calls across production environments.
The architecture utilizes a layered access-control plane that separates authentication at the edge from authorization within the core platform to reduce latency and operational risk.
A middle-layer API service acts as a technical control point, ensuring all agentic traffic follows consistent security protocols and cannot bypass protection boundaries.
Security invariants include edge-level authentication validation, core-platform-enforced authorization, and end-to-end request integrity using Salesforce-minted tokens.
The system is designed to contain multi-tenant blast radius risks, preventing runaway agents or malformed requests from impacting other customers in a shared environment.
Strict egress traffic filtering and cross-boundary revalidation are employed to maintain the principle of least privilege across the distributed compute layer.

#security #dist #sre

GitHub Engineering | Jan 20, 2026

AI-supported vulnerability triage with the GitHub Security Lab Taskflow Agent

Why it matters: Triaging security alerts is often manual and repetitive. This framework allows engineers to automate human-like reasoning to filter false positives at scale, combining the precision of CodeQL with the pattern-matching flexibility of LLMs to find real vulnerabilities faster.

GitHub Security Lab introduced the Taskflow Agent, an open-source framework for automating security research and vulnerability triage using LLMs.
Taskflows are defined in YAML files, breaking complex audits into smaller, sequential tasks to overcome LLM context window limitations and improve accuracy.
The framework utilizes Model Context Protocol (MCP) servers to perform conventional programming tasks like file fetching and searching alongside AI reasoning.
It supports asynchronous batch processing, allowing engineers to apply templated audit logic across numerous CodeQL alerts simultaneously.
In real-world use, the tool helped identify approximately 30 vulnerabilities by filtering out false positives that traditional static analysis tools struggle to rule out.

#security #mlp
