Curated topic

mlp

Posts tagged with mlp

Cloudflare Blog | Jun 15, 2026

Growing the Cloudflare AI team with talent from Ensemble AI

Why it matters: Reducing AI inference costs and memory overhead is critical for scaling apps. Integrating architectural-level model compression like NdLinear allows Cloudflare to run complex LLMs more efficiently on its global network, improving performance and economic viability for developers.

Cloudflare is integrating the Ensemble AI team to enhance its AI infrastructure and improve inference efficiency.
Ensemble AI developed NdLinear, a drop-in replacement for standard linear layers that reduces parameter counts by operating on multidimensional activations.
The team introduced NdLinear-LoRA, an efficient adaptation method designed to reduce the trainable parameters required for fine-tuning large models.
These architectural optimizations aim to reduce memory, compute, and deployment overhead for LLMs and multimodal models.
The technology will be integrated into Cloudflare Workers AI, complementing existing tools like the Infire inference engine and Unweight tensor compression.
The focus is on improving the economics of AI by optimizing GPU utilization and enabling scalable, globally distributed inference.
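To give a feel for why operating on multidimensional activations can shrink parameter counts, here is a minimal counting sketch. It assumes a scheme in which each axis of the activation tensor gets its own small linear map instead of one dense matrix over the flattened vector. Whether this matches NdLinear's exact formulation is an assumption, and the shapes are illustrative only.

```python
from math import prod


def dense_params(in_dims, out_dims, bias=True):
    """Parameters of one dense layer applied to the flattened activation."""
    n_in, n_out = prod(in_dims), prod(out_dims)
    return n_in * n_out + (n_out if bias else 0)


def per_axis_params(in_dims, out_dims, bias=True):
    """Parameters when each axis has its own small linear map (assumed scheme)."""
    return sum(i * o + (o if bias else 0) for i, o in zip(in_dims, out_dims))


# Illustrative shapes: a 32 x 64 activation mapped to a 32 x 64 output.
in_dims, out_dims = (32, 64), (32, 64)
dense = dense_params(in_dims, out_dims)
factored = per_axis_params(in_dims, out_dims)
print(f"dense: {dense:,} parameters, per-axis: {factored:,} parameters")
```

Under these assumed shapes the dense layer needs millions of parameters while the per-axis variant needs a few thousand; real savings depend on the model and on how the layer is actually defined.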

#mlp #dist #finops


GitHub Engineering | Jun 12, 2026

How we made GitHub Copilot CLI more selective about delegation

Why it matters: Optimizing agentic delegation is critical for reducing latency and failure rates in AI tools. This research shows that more delegation isn't always better; selective orchestration improves reliability and speed by minimizing handoff friction and redundant tool calls.

GitHub Copilot CLI introduced smarter subagent delegation to reduce coordination overhead and latency in agentic workflows.
The system now prioritizes direct execution by the main agent for simple tasks, reserving subagents for complex or parallelizable work.
Production A/B testing resulted in a 23% reduction in tool failures and a 5% improvement in P95 user wait time.
The team used LLM-based trajectory analysis to identify bottlenecks where subagents were performing redundant repository searches.
The orchestration policy was refined to treat subagents as tools for parallelism rather than default handlers for all tasks.
Developers can access these performance improvements by updating to Copilot CLI version 1.0.42 or later.
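The policy summarized above (handle simple tasks directly, reserve subagents for complex or parallelizable work) might be sketched roughly as follows. The `Task` fields and both thresholds are hypothetical and are not taken from Copilot CLI.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    independent_subtasks: int   # pieces that could run in parallel (hypothetical field)
    estimated_tool_calls: int   # rough size of the task (hypothetical field)


def should_delegate(task: Task, min_parallel: int = 2, min_tool_calls: int = 8) -> bool:
    """Delegate only when parallelism or task size justifies the handoff cost."""
    if task.independent_subtasks >= min_parallel:
        return True  # parallelizable work: subagents can run side by side
    return task.estimated_tool_calls >= min_tool_calls  # large enough to be "complex"


rename = Task("rename a variable in one file", independent_subtasks=1, estimated_tool_calls=2)
audit = Task("audit three services for a deprecated API", independent_subtasks=3, estimated_tool_calls=12)
print(should_delegate(rename), should_delegate(audit))
```

The point of the sketch is the default: the main agent does the work unless a concrete reason to hand off is present.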

#mlp #dist


Dropbox Tech Blog | Jun 12, 2026

How Dropbox uses MCP and Dash to close the design-to-code security gap

Why it matters: This approach solves the persistent problem of security requirements getting lost during long development cycles. By using MCP and AI to bridge the gap between documentation and code, engineers ensure critical threat mitigations are implemented without manual overhead or human error.

Dropbox identified a significant design-to-code gap where security threat models are often disconnected from actual implementation in pull requests.
Data analysis revealed that only 12% of PRs link back to original security reviews, with a median delay of five weeks between design and implementation.
The solution leverages the Model Context Protocol (MCP) to allow AI agents to access indexed documentation via Dash, Dropbox's AI-powered search tool.
The system automatically retrieves relevant threat models during code review and evaluates if changes align with previously agreed-upon security mitigations.
By using Dash's MCP server, the agent draws on years of security reviews and engineering docs without requiring manual linking or custom integrations.
This approach helps identify security-sensitive work that might have missed review, providing an early signal during the development lifecycle.

#security #mlp


Salesforce Engineering | Jun 11, 2026

How MuleSoft Is Raising the Trust Bar for AI-Generated Code

Why it matters: As AI-generated code accelerates development, traditional manual reviews can't keep up. MuleSoft’s Golden Gate provides a scalable model for automated, AI-powered PR governance that maintains high security and trust without slowing down developer velocity or increasing false positives.

MuleSoft launched Golden Gate, an AI-powered PR-time governance system to enforce security and compliance at the merge boundary.
The system addresses the volume and velocity challenges of agentic development by shifting enforcement left into developer workflows.
To minimize false positives, every AI skill undergoes a deterministic validation pipeline involving backtesting against real PR history.
New governance skills are deployed in advisory mode first to build developer trust before becoming mandatory merge-blocking gates.
The platform achieved a false positive rate of under 0.5% across 77,000 executions, demonstrating high-signal reliability at scale.
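As a rough illustration of the advisory-then-blocking rollout, the sketch below promotes a skill only when its measured false positive rate stays under a threshold. The promotion rule, the function names, and the definition of the rate as false positives divided by executions are assumptions for illustration, not MuleSoft's documented criteria.

```python
def false_positive_rate(false_positives: int, executions: int) -> float:
    """Share of executions that flagged something incorrectly (assumed definition)."""
    if executions <= 0:
        raise ValueError("executions must be positive")
    return false_positives / executions


def gate_mode(fp_rate: float, threshold: float = 0.005) -> str:
    """Keep a skill advisory until its false positive rate is below the threshold."""
    return "blocking" if fp_rate < threshold else "advisory"


# Under the assumed definition, a rate below 0.5% over 77,000 executions
# means fewer than 385 incorrect flags in total.
print(gate_mode(false_positive_rate(300, 77_000)))
print(gate_mode(false_positive_rate(800, 77_000)))
```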

#security #mlp #culture


GitHub Engineering | Jun 11, 2026

Making secret scanning more trustworthy: Reducing false positives at scale

Why it matters: False positives in security tools cause alert fatigue and erode developer trust. By using LLMs to understand code context, GitHub reduces noise by over 75%, ensuring engineers spend time fixing real vulnerabilities rather than triaging non-sensitive strings.

GitHub integrated LLM-based contextual reasoning into its secret scanning pipeline to significantly reduce false positive alerts.
The system moves beyond simple pattern matching by analyzing how a potential secret is used, such as being passed to an API request or authentication header.
By focusing on 'better context' rather than 'more context,' the verification step analyzes high-signal information within a single file to maintain low latency.
The approach was developed in collaboration with Microsoft Security & AI, leveraging techniques from the Agentic Secret Finder system.
The implementation achieved a 75.76% reduction in customer-confirmed false positives, exceeding the initial target of 65%.
This hybrid model combines pattern-based detection for known formats with AI-powered verification for unstructured secrets like passwords.
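The hybrid shape described above can be sketched as two stages: a regex for a known token format, and a context check for generic secrets such as passwords. In this sketch the context check is a simple keyword heuristic standing in for the LLM verification step, and the `demo_` token format, the hint list, and the placeholder list are all hypothetical.

```python
import re

# Stage 1: pattern-based detection for a known (here: made-up) token format.
KNOWN_FORMAT = re.compile(r"\bdemo_[A-Za-z0-9]{16}\b")

# Stage 2 candidates: string literals assigned to password-like names.
ASSIGNMENT = re.compile(r"""(?im)^\s*(\w*(?:password|secret)\w*)\s*=\s*["']([^"']+)["']""")

USAGE_HINTS = ("authorization", "headers", "requests.", "login(", "connect(")
PLACEHOLDERS = {"changeme", "example", "password", "xxxx"}


def verify_in_context(name, value, file_text):
    """Heuristic stand-in for LLM verification: is the value used like a credential?"""
    if value.lower() in PLACEHOLDERS:
        return False
    for line in file_text.splitlines():
        lowered = line.lower()
        if name.lower() in lowered and any(hint in lowered for hint in USAGE_HINTS):
            return True  # the variable flows into an auth-like call or header
    return False


def scan(file_text):
    """Return (kind, value) findings from a single file's text."""
    findings = [("known_format", m.group(0)) for m in KNOWN_FORMAT.finditer(file_text)]
    for m in ASSIGNMENT.finditer(file_text):
        if verify_in_context(m.group(1), m.group(2), file_text):
            findings.append(("verified_generic", m.group(2)))
    return findings


sample = 'db_password = "s3cr3t-Value"\nsession.headers["Authorization"] = db_password\n'
print(scan(sample))
```

Note that the check stays within one file, mirroring the single-file scope mentioned in the summary; a real verifier would reason about usage far more carefully than a keyword match.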

#security #mlp #sre


Slack Engineering | Jun 11, 2026

Agentic Testing: Where Agents Fit in the E2E Testing Stack

Why it matters: Agentic testing shifts E2E focus from rigid journeys to goal-based verification. While too slow and costly for every PR, it provides a powerful exploratory layer that adapts to UI changes and handles complex state transitions where traditional deterministic scripts often fail.

Traditional E2E tests enforce specific UI journeys, while agentic tests focus on achieving high-level goals through adaptive, non-deterministic action sequences.
The study compared three setups: an agent driving Playwright MCP, an agent driving the Playwright CLI, and agent-generated deterministic Playwright code.
Agentic workflows are significantly slower (10+ minutes) and costlier ($15–$30/run) than traditional scripts, limiting their use in standard CI/CD.
Structured YAML inputs outperformed natural language for complex workflows by providing explicit mapping between instructions and browser actions.
Agents excel at exploratory testing and self-healing during UI updates, making them ideal for post-deployment verification rather than pre-merge checks.

#frontend #mlp


GitHub Engineering | Jun 10, 2026

Give GitHub Copilot CLI real code intelligence with language servers

Why it matters: Integrating LSP servers into GitHub Copilot CLI replaces fragile text-search heuristics with precise semantic analysis. This enables the AI agent to accurately resolve types and definitions, significantly improving its reliability and effectiveness in complex codebases.

GitHub Copilot CLI traditionally relies on text-search heuristics like grep and binary extraction, which lack semantic precision.
The Language Server Protocol (LSP) provides structured code intelligence, enabling accurate type resolution and cross-referencing.
The 'LSP Setup' skill automates the installation and configuration of language servers for 14 supported languages.
The skill follows a 7-step workflow including OS detection, package manager selection, and configuration verification.
Configuration can be scoped at the user level or repository level, with repository-level settings taking precedence.
Integrating LSPs allows the AI agent to handle complex scenarios like generics, overloads, and compiled bytecode effectively.
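The precedence rule mentioned above (repository-level settings override user-level ones) can be illustrated with a small merge. The dictionary shape, keyed by language with a server command as the value, is a hypothetical stand-in and not the actual Copilot CLI configuration format.

```python
def effective_lsp_config(user_cfg: dict, repo_cfg: dict) -> dict:
    """Merge two scopes; repository-level entries win on conflict."""
    merged = dict(user_cfg)   # start from user-level defaults
    merged.update(repo_cfg)   # repository-level settings take precedence
    return merged


# Hypothetical settings for illustration only.
user_cfg = {"python": "pyright-langserver --stdio", "go": "gopls"}
repo_cfg = {"python": "pylsp"}

print(effective_lsp_config(user_cfg, repo_cfg))
```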

#mlp


Spotify Engineering | Jun 10, 2026

Encoding Your Domain Expert: The Context Layer Behind Spotify's Data Assistant

Why it matters: This article highlights how Spotify uses a context layer to bridge the gap between LLMs and complex internal data. It demonstrates a scalable way to encode domain expertise into AI assistants, significantly improving data discovery and reducing the manual burden on human experts.

Spotify developed a Context Layer to provide LLMs with organizational domain knowledge.
The system bridges the gap between raw data assets and user intent through semantic mapping.
It automates the discovery of relevant dashboards and datasets, reducing reliance on human experts.
The architecture focuses on metadata enrichment to improve the accuracy of the AI data assistant.
This approach solves the 'cold start' problem where generic LLMs lack specific internal context.

#data #mlp


Salesforce Engineering | Jun 9, 2026

How to Build Reliable AI Agents: 5 Engineering Patterns from a Production System

Why it matters: Transitioning AI agents from demos to production requires a shift from prompt engineering to system engineering. This article highlights how to handle non-deterministic tasks in critical infrastructure, ensuring agents can safely automate complex cloud optimization worth millions.

AI agents often fail in production because they struggle with non-deterministic environments and scattered sources of truth in infrastructure.
Relying on increasingly complex prompts creates unmaintainable 'software written in English' without improving underlying reliability.
Multi-agent architectures do not automatically solve consistency issues if the model is tasked with problems that require deterministic logic.
Reliability is achieved by engineering the systems around the model rather than focusing solely on model capability or prompt refinement.
The lack of a single source of truth in complex deployment stacks (Terraform, Helm, etc.) prevents agents from reasoning effectively without external guardrails.

#mlp #finops #sre


GitHub Engineering | Jun 9, 2026

From one-off prompts to workflows: How to use custom agents in GitHub Copilot CLI

Why it matters: Custom agents reduce friction by embedding team-specific context and standards directly into the CLI. This allows engineers to automate repetitive tasks with consistent, reviewable, and version-controlled AI workflows, ensuring high-quality outputs across the entire development lifecycle.

GitHub Copilot CLI now supports custom agents defined via Markdown files stored in the .github/agents directory.
Agent profiles use YAML frontmatter to specify roles, expertise, accessible tools, and safety guardrails for consistent execution.
Custom agents enable teams to encode specific standards, such as accessibility or security, into reusable and version-controlled workflows.
These agents provide uniform behavior across different environments, including the terminal, IDE, and GitHub pull requests.
Developers can invoke specialized agents using the /agent slash command within the Copilot CLI to perform complex, context-aware tasks.
The system allows for the automation of repetitive patterns, like log translation or code formatting, without manual prompt engineering.
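To make the "Markdown file with YAML frontmatter" shape concrete, here is a minimal sketch that splits such a profile into metadata and instructions. It handles only flat `key: value` pairs rather than full YAML, and the field names and agent text in the example are illustrative assumptions, not the documented Copilot schema.

```python
def split_frontmatter(text: str):
    """Split a Markdown profile into (metadata dict, body). Flat key: value pairs only."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block present
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # unterminated frontmatter: treat everything as body
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# Hypothetical agent profile for illustration only.
profile = """---
name: accessibility-reviewer
description: Reviews UI changes for accessibility issues
---
Check every changed component for missing labels and poor contrast.
"""

meta, body = split_frontmatter(profile)
print(meta["name"], "|", body)
```

Because the profile is an ordinary file in the repository, changes to it can go through the same review and version control as any other code.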

#mlp #culture

