Curated topic

mlp

Posts tagged with mlp

GitHub Engineering, May 12, 2026

Dungeons & Desktops: Building a procedurally generated roguelike with GitHub Copilot CLI

Why it matters: This project showcases how AI agents and CLI tools can accelerate experimental development. It highlights a novel way to use repository metadata for procedural generation while demonstrating a shift toward intent-based programming where AI handles implementation details.

Developed GitHub Dungeons, a Go-based CLI extension that transforms code repositories into playable roguelike games.
Implemented procedural generation using Binary Space Partitioning (BSP) seeded by the repository's latest commit SHA for consistent map layouts.
Used GitHub Copilot CLI's /delegate command to hand off complex tasks, such as game balancing and feature implementation, to be completed asynchronously via pull requests.
Created a specialized 'dungeon scribe' AI agent to automatically generate technical documentation and ASCII art diagrams.
Demonstrated a workflow shift where the developer focuses on high-level game design and behavior while AI manages syntax and boilerplate code.
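
The SHA-seeded BSP idea above can be sketched roughly as follows. The original project is written in Go; this Python version is only an illustration, and every name in it (`seed_from_sha`, `bsp_split`, `generate_map`, the 80x24 grid, the minimum room size) is an assumption rather than the project's actual code.

```python
import random

def seed_from_sha(sha: str) -> int:
    # A commit SHA is a hex string, so it converts directly to an integer seed.
    return int(sha, 16)

def bsp_split(x, y, w, h, rng, min_size=6, depth=4):
    """Recursively split a rectangle; return the leaf rectangles as (x, y, w, h)."""
    can_split_w = w >= 2 * min_size
    can_split_h = h >= 2 * min_size
    if depth == 0 or not (can_split_w or can_split_h):
        return [(x, y, w, h)]
    # Pick the split axis at random when both are possible.
    if can_split_w and can_split_h:
        vertical = rng.random() < 0.5
    else:
        vertical = can_split_w
    if vertical:
        cut = rng.randint(min_size, w - min_size)
        return (bsp_split(x, y, cut, h, rng, min_size, depth - 1)
                + bsp_split(x + cut, y, w - cut, h, rng, min_size, depth - 1))
    cut = rng.randint(min_size, h - min_size)
    return (bsp_split(x, y, w, cut, rng, min_size, depth - 1)
            + bsp_split(x, y + cut, w, h - cut, rng, min_size, depth - 1))

def generate_map(sha: str, width=80, height=24):
    # The same SHA always yields the same random stream, hence the same layout.
    rng = random.Random(seed_from_sha(sha))
    return bsp_split(0, 0, width, height, rng)
```

Because the generator is seeded only by the SHA, two players on the same commit would see the same partitioning, and a new commit would reshuffle it.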

#mlp #culture

Pinterest Engineering, May 8, 2026

Enhancing Ad Relevance: Integrating Real-Time Context into Sequential Recommender Models

Why it matters: This approach addresses the 'cold start' problem of session intent in recommendation systems by blending offline historical sequences with real-time context. The hybrid inference model balances computational efficiency with immediate relevance, significantly improving candidate survival in ranking funnels.

Developed a Contextual Sequential Two Tower Model to integrate real-time user context with historical offsite behavior.
Integrated a context layer into the query tower that concatenates Transformer-encoded sequences with real-time subject Pin features.
Utilized synthetic augmented data during training by injecting pseudo-context from positive labels to teach the model contextual relevance.
Implemented a hybrid inference flow where historical sequences are computed offline and context layers are processed online at request time.
Achieved a 3x to 10x improvement in Recall@K and a 275-300% increase in candidate relevance for Related Pins.
Resulted in a 0.7% lift in Return on Ad Spend (ROAS) by increasing the survival rate of candidates in the ranking funnel.
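
A minimal sketch of the hybrid inference flow described above, assuming the sequence embedding is looked up from an offline store and the context layer is a single linear projection over the concatenated inputs. The store, the weights, the candidate names, and all function names are hypothetical; the real model is a trained two-tower network.

```python
from typing import Dict, List

# Offline store: user id -> precomputed sequence embedding (hypothetical values).
OFFLINE_SEQUENCE_EMBEDDINGS: Dict[str, List[float]] = {"user_1": [0.2, 0.8]}

def context_layer(seq_emb: List[float], pin_features: List[float],
                  weights: List[List[float]]) -> List[float]:
    """Concatenate both inputs, then apply one linear projection."""
    x = seq_emb + pin_features  # list concatenation
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def retrieve(user_id: str, pin_features: List[float], weights: List[List[float]],
             candidates: Dict[str, List[float]], k: int) -> List[str]:
    seq_emb = OFFLINE_SEQUENCE_EMBEDDINGS[user_id]          # computed offline
    query = context_layer(seq_emb, pin_features, weights)   # computed at request time
    ranked = sorted(candidates, key=lambda name: dot(query, candidates[name]),
                    reverse=True)
    return ranked[:k]
```

With the same stored sequence embedding, changing only the real-time Pin features changes the query vector and therefore the ranking, which is the point of adding context at request time.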

#mlp #data

GitHub Engineering, May 7, 2026

Improving token efficiency in GitHub Agentic Workflows

Why it matters: Optimizing agentic workflows is critical for managing CI/CD costs. By moving data retrieval out of the LLM reasoning loop and pruning unused tool schemas, engineers can significantly reduce token consumption and latency without sacrificing agent performance.

Implemented an API proxy to capture normalized token usage data across different agent frameworks including Claude and Copilot CLI.
Deployed automated Auditor and Optimizer workflows to identify usage anomalies and propose specific code-level optimizations.
Reduced context overhead by pruning unused Model Context Protocol (MCP) tool registrations, saving 8-12 KB of schema per call.
Shifted data-fetching operations from LLM tool calls to deterministic GitHub CLI commands to minimize reasoning steps and round-trips.
Developed an Effective Tokens (ET) metric to normalize costs across different models and account for prompt caching benefits.
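
The summary does not give the Effective Tokens formula, so the following is only one plausible shape for such a metric: discount cached input tokens, weight output tokens more heavily, and scale by a per-model multiplier. The weights, model names, and function name are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Usage:
    uncached_input: int
    cached_input: int
    output: int

# Hypothetical weights, chosen only for illustration.
CACHE_DISCOUNT = 0.25   # cached input tokens count at a quarter of full price
OUTPUT_WEIGHT = 4.0     # output tokens count four times an input token
MODEL_MULTIPLIER = {"small-model": 0.5, "large-model": 2.0}

def effective_tokens(model: str, usage: Usage) -> float:
    """Fold raw token counts into one comparable number per call."""
    raw = (usage.uncached_input
           + CACHE_DISCOUNT * usage.cached_input
           + OUTPUT_WEIGHT * usage.output)
    return MODEL_MULTIPLIER[model] * raw
```

Under these assumed weights, a call that serves most of its prompt from cache scores lower than the same call with no cache hits, and the same usage costs more on a larger model.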

#finops #mlp #sre

Cloudflare Blog, May 7, 2026

Building for the future

Why it matters: Cloudflare's massive restructuring signals a shift in how tech giants view workforce composition in the age of AI agents. It highlights the transition from traditional engineering roles to AI-augmented workflows, setting a precedent for industry-wide organizational changes.

Cloudflare is reducing its global workforce by over 1,100 employees to restructure for the agentic AI era.
The company reports a 600% increase in internal AI usage over the last three months, with employees running thousands of AI agent sessions daily.
This shift involves reimagining all internal processes, roles, and organizational structures to prioritize AI-driven workflows.
Departing employees receive significant severance, including base pay through 2026 and accelerated equity vesting.
The move aims to maintain high growth and innovation by moving away from legacy organizational structures that no longer suit an AI-native company.

#culture #mlp

GitHub Engineering, May 7, 2026

Agent pull requests are everywhere. Here’s how to review them.

Why it matters: As AI agents exponentially increase code volume, engineers face a critical review gap. Identifying specific failure modes like CI gaming and redundancy is essential to prevent long-term technical debt and maintain system integrity in an automated development lifecycle.

Agent-generated code often introduces hidden technical debt and redundancy despite appearing clean and passing initial CI checks.
Reviewers must watch for 'CI gaming,' where agents weaken test coverage, skip linting, or modify workflows to force a passing state.
Agents frequently lack global repository context, leading to 'code reuse blindness' and the duplication of existing utility functions.
Logical errors like off-by-one mistakes or missing permission checks can result in 'hallucinated correctness' that compiles but fails in production.
Large, unscoped agent PRs are prone to 'agentic ghosting,' where the tool fails to address complex feedback or enters circular logic loops.
Human reviewers should focus on tracing critical paths and enforcing code consolidation rather than simply scanning diffs for style.
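
As a rough illustration of the 'CI gaming' checks above, a reviewer could pre-screen a pull request for workflow edits and newly added suppression markers. This is a heuristic sketch, not a tool from the article; the input shape (a list of changed paths with their added lines), the marker list, and the function name are assumptions.

```python
from typing import List, Tuple

# Markers that commonly disable a test or a lint rule when newly added.
SUSPICIOUS_ADDED = (
    "@pytest.mark.skip",
    "continue-on-error: true",
    "eslint-disable",
    "# noqa",
)

def flag_ci_gaming(changes: List[Tuple[str, List[str]]]) -> List[Tuple[str, str]]:
    """changes: (path, added_lines) pairs. Returns (path, reason) findings."""
    findings = []
    for path, added_lines in changes:
        # Any edit to a workflow file deserves a manual look.
        if path.startswith(".github/workflows/"):
            findings.append((path, "workflow modified"))
        for line in added_lines:
            for marker in SUSPICIOUS_ADDED:
                if marker in line:
                    findings.append((path, f"added {marker}"))
    return findings
```

A finding is only a prompt for a closer look: skip markers and workflow edits are often legitimate, so the output should feed a human review rather than block a merge.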

#culture #security #mlp

Salesforce Engineering, May 7, 2026

How Informatica Built a Multi-Agent AI System to Reduce Data Workflows from Months to Days

Why it matters: This article demonstrates how multi-agent architectures solve the limitations of single-agent AI in complex enterprise environments. By decomposing workflows into specialized agents, engineers can achieve higher accuracy, better context management, and faster execution for data-heavy tasks.

CLAIRE is a multi-agent AI system integrated into Informatica's IDMC to automate complex data workflows across discovery, governance, and quality.
The architecture uses an orchestration agent as a control plane for intent detection, plan generation, and routing to specialized agents.
Specialized agents handle specific domains like data quality and profiling, which reduces context load and improves tool selection accuracy.
A planning layer allows users to review and modify high-level execution plans before the system executes workflows involving 50-60 model calls.
The system transitioned from a single-agent model to specialized agents to overcome context limits, latency, and inconsistent tool invocation.
This multi-agent approach reduced enterprise data workflow completion times from three months to just a few days with a 90% task success rate.
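
The orchestration pattern above (intent detection, plan generation, a reviewable plan, then routing to specialized agents) can be sketched as below. Keyword matching stands in for the model call that would detect intent in practice, and the agent functions, keyword table, and semicolon-separated request format are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

def data_quality_agent(task: str) -> str:
    return f"quality:{task}"

def profiling_agent(task: str) -> str:
    return f"profiling:{task}"

# Each specialized agent sees only the tasks routed to it.
AGENTS: Dict[str, Callable[[str], str]] = {
    "quality": data_quality_agent,
    "profiling": profiling_agent,
}
KEYWORDS = {
    "quality": ("duplicate", "validate"),
    "profiling": ("profile", "distribution"),
}

def detect_intent(step: str) -> str:
    lowered = step.lower()
    for intent, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return intent
    raise ValueError(f"no agent for step: {step!r}")

def make_plan(request: str) -> List[Tuple[str, str]]:
    """Split a request into steps and tag each with its target agent."""
    steps = [s.strip() for s in request.split(";") if s.strip()]
    return [(detect_intent(s), s) for s in steps]

def execute(plan: List[Tuple[str, str]]) -> List[str]:
    return [AGENTS[intent](step) for intent, step in plan]
```

Keeping `make_plan` separate from `execute` leaves a natural point where a user can inspect or edit the plan before anything runs.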

#data #mlp

GitHub Engineering, May 6, 2026

Validating agentic behavior when “correct” isn’t deterministic

Why it matters: As AI agents move to autonomous 'computer use,' traditional testing makes pipelines brittle. Engineers need validation frameworks that handle non-determinism so that agents can be verified as reliable without halting production over incidental environmental noise.

Traditional deterministic testing fails for AI agents because they use non-linear, multi-path execution to achieve goals.
Environmental noise like network lag often triggers false negatives in rigid assertion-based or record-and-replay testing frameworks.
Engineers should shift to a "Trust Layer" that focuses on essential milestones and logical outcomes rather than specific step sequences.
Agent behavior is categorized into essential states, optional variations, and convergent paths to distinguish success from incidental noise.
This framework allows for more robust CI/CD pipelines when deploying agentic systems in production environments.
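
One way to read the milestone idea above: a run passes if its trace reaches each essential milestone in order, regardless of what happens in between. In this sketch each milestone is a set of acceptable states (to model convergent paths) and anything else in the trace is treated as optional variation or noise. The state names and the `validate` function are hypothetical.

```python
from typing import Iterable, List, Set

def validate(trace: Iterable[str], milestones: List[Set[str]]) -> bool:
    """True if every milestone is reached, in order, somewhere in the trace."""
    remaining = iter(trace)
    for alternatives in milestones:
        # Consume the trace until one acceptable state for this milestone appears.
        if not any(step in alternatives for step in remaining):
            return False
    return True

# Two ways of submitting converge on the same milestone.
MILESTONES = [{"open_form"}, {"submit_click", "submit_enter"}, {"confirmation"}]
```

A trace with retries and dismissed popups still passes as long as the three milestones appear in order, while a trace that shows the confirmation before any submit does not.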

#mlp #sre

Netflix Tech Blog, May 4, 2026

Democratizing Machine Learning at Netflix: Building the Model Lifecycle Graph

Why it matters: As ML scales, infrastructure silos prevent collaboration and lineage tracking. Netflix’s Model Lifecycle Graph solves this by unifying heterogeneous metadata into a queryable graph, enabling engineers to discover assets, track dependencies, and understand model impact across the enterprise.

Netflix addressed ML fragmentation by building a Metadata Service (MDS) that creates a unified Model Lifecycle Graph across diverse business domains.
The system solves the 'black box' problem where models, features, and pipelines were siloed in domain-specific infrastructure.
A standardized addressing scheme using AIP URIs (aip://type/platform/resource) enables global uniqueness and cross-service referencing.
The architecture decouples abstract domains (like Models or Pipelines) from concrete providers, allowing for easy integration of new backend systems.
MDS enables critical discovery capabilities, such as identifying which A/B tests are using specific models or tracking feature lineage to source data.
The graph-based approach facilitates cross-pollination, allowing teams like Ads or Personalization to reuse sophisticated embeddings created by Studio teams.
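
The `aip://type/platform/resource` addressing scheme above lends itself to a small parser. This sketch assumes the resource is everything after the second slash (so it may itself contain slashes); the class name, function name, and example URI are made up for illustration and are not Netflix's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AipUri:
    type: str
    platform: str
    resource: str

    def __str__(self) -> str:
        return f"aip://{self.type}/{self.platform}/{self.resource}"

def parse_aip_uri(uri: str) -> AipUri:
    prefix = "aip://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AIP URI: {uri!r}")
    # Split into at most three parts so the resource keeps any inner slashes.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected aip://type/platform/resource, got {uri!r}")
    return AipUri(*parts)
```

Because the dataclass is frozen, parsed URIs are hashable and can serve directly as node keys in a graph of models, features, and pipelines.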

#mlp #data #dist

GitHub Engineering, May 4, 2026

Why it matters: As AI evolves from simple prompts to autonomous agents, engineers need frameworks that handle state and orchestration. OpenClaw provides the infrastructure to build reliable, long-running agentic workflows, moving AI from experimental demos to production-ready systems.

OpenClaw is an open-source framework, with over 350,000 stars, for building and running agentic AI systems.
The framework focuses on tool orchestration, state management, and handling long-running workflows for production environments.
GitHub is hosting the OpenClaw: After Hours event on June 3, 2026, at its San Francisco headquarters during Microsoft Build.
The event features a fireside chat with creator Peter Steinberger and panels with maintainers on shipping real-world agentic systems.
Technical discussions will cover the practical challenges of moving beyond simple prompt demos to autonomous systems that execute tasks.

#mlp #dist #culture

Salesforce Engineering, May 4, 2026

Scaling AI-Driven Conversations from 10K to 100K While Maintaining Real-Time Consistency

Why it matters: Scaling real-time conversational data is critical for AI agents requiring immediate context. This architecture shows how to balance high-throughput ingestion with low-latency retrieval, ensuring consistency in distributed systems even under extreme traffic spikes.

Evolved CSS from a Postgres-based transactional system to a horizontally scaled NoSQL architecture to mitigate hotspots and handle bursty traffic.
Implemented Kafka with conversation-level partitioning to stabilize ingestion through buffering and batching, targeting 100,000 concurrent interactions.
Introduced VegaCache to resolve read-after-write consistency issues caused by asynchronous processing delays in the streaming pipeline.
Optimized data handling for AI-driven workloads by using compression for large payloads and pagination for long conversation threads.
Transitioned toward a curated Kafka model where the conversation stream serves as an ordered, reliable source of truth for downstream AI pipelines.
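
Conversation-level partitioning, as described above, means every message in a conversation maps to the same partition, which preserves per-conversation order. The sketch below simulates that routing in memory; it uses CRC32 as a stand-in hash (not Kafka's own partitioner), and the function names and message shape are assumptions.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

def partition_for(conversation_id: str, num_partitions: int) -> int:
    # A stable hash of the key, so one conversation always maps to one partition.
    return zlib.crc32(conversation_id.encode("utf-8")) % num_partitions

def route(messages: List[Tuple[str, str]],
          num_partitions: int) -> Dict[int, List[Tuple[str, str]]]:
    """Append each (conversation_id, text) message to its partition, in arrival order."""
    partitions: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for conversation_id, text in messages:
        partitions[partition_for(conversation_id, num_partitions)].append(
            (conversation_id, text))
    return partitions
```

Ordering is only guaranteed within a partition, so keying on the conversation ID is what lets a downstream consumer treat the stream as an ordered record of each conversation.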

#dist #data #mlp
