Posts tagged with mlp
Why it matters: Context engineering integrates organizational standards into AI workflows. By providing structured context, engineers ensure AI-generated code adheres to specific architectures, reducing manual corrections and maintaining high-quality standards across the codebase.
- Context engineering focuses on providing the right information and format to LLMs rather than just clever phrasing.
- Custom instructions allow teams to define global or task-specific rules for coding conventions and naming standards.
- Reusable prompt files (.prompt.md) standardize common workflows like code reviews, scaffolding, and test generation.
- Custom agents enable specialized AI personas with defined responsibilities, such as security analysis or API design.
- Implementing these techniques improves code accuracy and consistency while reducing repetitive manual prompting.
Why it matters: This integration enables engineers to build specialized AI agents for highly regulated sectors. By combining Claude's reasoning with domain-specific MCPs and Azure's secure infrastructure, teams can automate complex medical reasoning and R&D tasks while maintaining strict compliance.
- Anthropic and Microsoft launched Claude for Healthcare and Life Sciences on Microsoft Foundry, offering domain-specific AI agents for complex medical workflows.
- The platform utilizes Model Context Protocols (MCPs) and specialized connectors to integrate Claude with scientific databases and clinical systems.
- Healthcare features automate administrative tasks like prior authorization and claims appeals using advanced reasoning and evidence synthesis.
- Life sciences capabilities support bioinformatics, experimental protocol design, and molecular design via code interpreter workflows.
- The solution is built on Azure's HIPAA-ready infrastructure, ensuring enterprise-grade security and biosafety guardrails for regulated environments.
Why it matters: As AI-generated code becomes more prevalent, type systems provide a critical safety net by automatically catching the most common class of LLM-introduced errors: type-check failures, which one study found account for 94% of LLM-generated compilation errors. This shift ensures reliability and maintainability in projects where developers no longer write every line of code manually.
- AI-generated code increases the volume of unvetted logic, making type-driven safety nets essential for maintaining software reliability.
- A 2025 study found that 94% of LLM-generated compilation errors are type-check failures, which static typing can catch automatically.
- TypeScript has overtaken Python and JavaScript as the most used language on GitHub, driven by AI-assisted development and framework defaults.
- Type systems serve as a shared contract between developers and AI agents to ensure scaffolding and boilerplate conform to project standards; a minimal sketch follows this list.
- Growth in typed languages extends beyond TypeScript to include Luau, Typst, and traditional languages like Java, C++, and C#.
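To make the "shared contract" idea concrete, here is a minimal TypeScript sketch; the `ApiResponse` and `buildUserResponse` names are hypothetical and not taken from the post.

```typescript
// Hypothetical project contract that both humans and coding agents must satisfy.
interface ApiResponse {
  status: "ok" | "error";
  data: { id: number; email: string };
}

// If an agent returned `{ id: rawId, email }` with rawId still a string,
// `tsc --strict` would reject it before code review. The type annotation on
// the return value is what forces the explicit conversion below.
function buildUserResponse(rawId: string, email: string): ApiResponse {
  return {
    status: "ok",
    data: { id: Number(rawId), email },
  };
}

console.log(buildUserResponse("42", "dev@example.com"));
```

With strict mode enabled, the compiler enforces this contract on every generated change, whether it was written by a person or an agent.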
Why it matters: Separating these stacks allows engineering teams to optimize for specific performance and reliability needs. It reduces architectural complexity, ensuring that ML-driven personalization doesn't compromise the statistical validity of A/B testing frameworks.
- Spotify maintains distinct technical stacks for personalization and experimentation to address their unique operational requirements.
- Personalization systems are optimized for low-latency model inference and high-throughput content delivery.
- Experimentation infrastructure focuses on statistical validity, randomized assignment, and unbiased metric analysis (a bucketing sketch follows this list).
- Decoupling these domains prevents architectural complexity and avoids the pitfalls of a monolithic 'one-size-fits-all' solution.
- Independent stacks allow teams to scale infrastructure based on specific data lifecycles and performance bottlenecks.
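Randomized assignment is one of the properties the experimentation stack protects. Below is a minimal TypeScript sketch of deterministic hash-based bucketing; the experiment name and 50/50 split are illustrative assumptions, not Spotify's implementation.

```typescript
import { createHash } from "node:crypto";

// Deterministic assignment: the same user always lands in the same variant
// for a given experiment, while assignment across users is effectively random.
function assignVariant(userId: string, experiment: string): "control" | "treatment" {
  const digest = createHash("sha256").update(`${experiment}:${userId}`).digest();
  const bucket = digest.readUInt32BE(0) / 0xffffffff; // roughly uniform in [0, 1]
  return bucket < 0.5 ? "control" : "treatment";
}

console.log(assignVariant("user-123", "home-shelf-ranking-v2"));
```

Keeping this logic in the experimentation stack, rather than inside personalization services, is what preserves unbiased comparisons between variants.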
Why it matters: Scaling AI agents to enterprise levels requires moving beyond simple task assignment to robust orchestration. This architecture shows how to manage LLM rate limits and provider constraints using queues and dispatchers, ensuring reliability for high-volume, time-sensitive workflows.
- Transitioned from a single-agent MVP to a dispatcher-orchestrated multi-agent architecture to support over 1 million monthly outreach actions.
- Implemented persistent queuing to decouple task arrival from processing, creating a natural buffer for workload spikes and preventing retry storms.
- Developed a constraint engine to enforce provider-specific quotas and LLM rate limits, ensuring compliance with Gmail and O365 delivery caps.
- Utilized fairness algorithms like Round-Robin and priority-aware polling to prevent resource monopolization and ensure timely processing of urgent tasks; a dispatcher sketch follows this list.
- Adopted a phased scaling strategy to evolve throughput from 15,000 to over 1 million messages monthly through parallel execution across 20 agents.
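The dispatcher behavior described above can be illustrated with a short TypeScript sketch. The queue shapes, tenant names, and provider caps are assumptions for illustration, not the article's implementation, and the real system persists its queues rather than holding them in memory.

```typescript
type Provider = "gmail" | "o365";

interface Task {
  tenant: string;
  provider: Provider;
  urgent: boolean;
  payload: string;
}

// Hypothetical per-provider daily delivery caps enforced by the constraint check.
const dailyCap: Record<Provider, number> = { gmail: 500, o365: 10000 };
const sentToday: Record<Provider, number> = { gmail: 0, o365: 0 };

// Per-tenant task queues (in-memory stand-in for persistent queues).
const queues = new Map<string, Task[]>();

function enqueue(task: Task): void {
  const q = queues.get(task.tenant) ?? [];
  q.push(task);
  queues.set(task.tenant, q);
}

// One dispatcher tick: visit queues round-robin, let urgent tasks jump the line
// within a queue, and skip work that would breach a provider's daily cap.
function dispatchOnce(): Task[] {
  const dispatched: Task[] = [];
  for (const q of queues.values()) {
    if (q.length === 0) continue;
    q.sort((a, b) => Number(b.urgent) - Number(a.urgent)); // priority-aware polling
    const next = q[0];
    if (sentToday[next.provider] >= dailyCap[next.provider]) continue; // constraint check
    q.shift();
    sentToday[next.provider] += 1;
    dispatched.push(next);
  }
  return dispatched;
}

enqueue({ tenant: "acme", provider: "gmail", urgent: true, payload: "follow-up" });
enqueue({ tenant: "globex", provider: "o365", urgent: false, payload: "intro" });
console.log(dispatchOnce());
```

Because the queue absorbs arrival spikes and the cap check happens at dispatch time, retries and bursts never translate directly into provider-facing traffic.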
Why it matters: Azure's proactive infrastructure design ensures engineers can deploy next-gen AI models on NVIDIA Rubin hardware immediately. By solving power, cooling, and networking bottlenecks at the datacenter level, Microsoft enables massive-scale AI training and inference with minimal friction.
- Azure's datacenter infrastructure is pre-engineered to support NVIDIA's Rubin platform, including Vera Rubin NVL72 racks.
- The Rubin platform delivers a 5x performance jump over GB200, offering 50 PF NVFP4 inference per chip and 3.6 EF per rack (72 GPUs x 50 PF).
- Infrastructure upgrades include 6th-gen NVLink fabric with ~260 TB/s bandwidth and ConnectX-9 1,600 Gb/s scale-out networking.
- Azure utilizes a systems approach, integrating liquid cooling, Azure Boost offload engines, and Azure Cobalt CPUs to optimize GPU utilization.
- Advanced memory architectures like HBM4/HBM4e and SOCAMM2 are supported through pre-validated thermal and density planning.
Why it matters: The shift from AI as autocomplete to autonomous agents marks a major evolution in productivity. Understanding agentic workflows, MCP integration, and spec-driven development is essential for engineers to leverage the next generation of AI-native software engineering.
- GitHub Copilot introduced Agent Mode, enabling real-time code iteration and autonomous error correction directly within the IDE.
- The new Coding Agent automates the full development lifecycle from issue assignment and repository exploration to pull request creation.
- Agent HQ provides a unified ecosystem allowing developers to integrate agents from multiple providers like OpenAI and Anthropic into GitHub.
- Model Context Protocol (MCP) support and the GitHub MCP Registry simplify how AI agents interact with external tools and data sources (a minimal server sketch follows this list).
- Spec-driven development emerged as a key methodology, using the Spec Kit to make structured specifications the center of agentic workflows.
- The year featured critical industry reflections, including Git's 20th anniversary and security lessons learned from the Log4Shell vulnerability.
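For readers new to MCP, here is a minimal sketch of a server exposing one tool, assuming the TypeScript MCP SDK (`@modelcontextprotocol/sdk`); the `lookup_issue` tool and its behavior are hypothetical and not part of the GitHub MCP Registry.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// An MCP server advertises named tools with typed inputs; any MCP-aware agent
// can discover and call them without bespoke integration code.
const server = new McpServer({ name: "demo-tools", version: "1.0.0" });

server.tool(
  "lookup_issue",
  { id: z.string() },
  async ({ id }) => ({
    content: [{ type: "text", text: `Issue ${id}: placeholder summary` }],
  })
);

// Serve over stdio so a local agent (e.g. an IDE) can launch and talk to it.
await server.connect(new StdioServerTransport());
```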
Why it matters: Automating incident response at hyperscale reduces human error and cognitive load during high-pressure events. By using AI agents to correlate billions of signals, teams can cut resolution times by up to 80%, shifting from reactive manual triage to proactive, explainable mitigation.
- Salesforce developed the Incident Command Deputy (ICD) platform, a multi-agent system powered by Agentforce to automate incident response.
- The system utilizes AI-based anomaly detection across metrics, logs, and traces to replace static thresholds and manual monitoring at hyperscale.
- ICD unifies fragmented data from observability, CI/CD, and change management systems into a single reasoning surface for AI agents.
- Agentforce-powered agents automate evidence collection and hypothesis generation, significantly reducing cognitive load for engineers during 3:00 AM incidents.
- The platform has successfully reduced resolution time for common Severity 2 incidents by 70-80%, with many detected and resolved within ten minutes.
Why it matters: GitHub Copilot coding agents can significantly reduce technical debt and backlog bloat. By applying the WRAP framework, engineers can delegate repetitive tasks to AI, allowing them to focus on high-level architecture and complex problem-solving.
- The WRAP framework (Write, Refine, Atomic, Pair) provides a structured approach to using GitHub Copilot coding agents for backlog management.
- Effective issue writing requires treating the agent like a new team member by providing context, descriptive titles, and specific code examples.
- Custom instructions at the repository and organization levels help standardize code quality and enforce specific patterns across projects.
- Large-scale migrations or features should be decomposed into small, atomic tasks to ensure pull requests remain reviewable and accurate.
- The human-agent pairing model leverages human strengths in navigating ambiguity and understanding 'why' while the agent handles execution.
Why it matters: These insights help engineers navigate the 2026 landscape by focusing on AI standards, sustainable open-source practices, and privacy-centric design. Understanding these trends is crucial for building resilient, future-proof software in an era of rapid technological shifts.
- The Model Context Protocol (MCP) provides an open standard for AI systems to interact with tools consistently, improving interoperability and trust.
- Modern AI and open-source tools have lowered the barrier for DIY development, enabling engineers to build purpose-built personal tools with less overhead.
- Open source sustainability requires more than just funding; it depends on community health, communication, and institutional support like the Sovereign Tech Fund.
- Data from the 2025 Octoverse report highlights the dominance of TypeScript and the rapid adoption of AI-assisted workflows across millions of developers.
- The Home Assistant project demonstrates the viability of privacy-first, local-control architectures in a cloud-dominated IoT landscape to avoid vendor lock-in.