Posts tagged with data

Why it matters: This system provides real-time, statistically robust insights into content safety, enabling platforms to proactively identify and mitigate harms. It's crucial for maintaining user trust and scaling content moderation efficiently with AI.

  • Pinterest developed an AI-assisted system to measure the "prevalence" of policy-violating content, defined as the percentage of total views that land on violating content.
  • This system addresses the shortcomings of report-only metrics, which often miss under-reported harms and lack statistical power.
  • It utilizes ML-assisted sampling from daily user impressions, leveraging production risk scores for efficiency while ensuring unbiased prevalence estimates.
  • A multimodal LLM (vision + text) enables bulk labeling of sampled content, significantly reducing latency and cost compared to human review.
  • Inverse-probability weighting ensures unbiased, design-consistent prevalence metrics, decoupling measurement from enforcement model thresholds.
  • Continuous calibration, human validation, and periodic checks against SME-labeled gold sets maintain LLM accuracy and detect model drift.
  • The system provides daily, statistically powered insights for faster interventions and effective content safety tracking.
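The sampling-plus-weighting scheme above amounts to a Horvitz-Thompson estimator. A minimal sketch, not Pinterest's code: the `estimate_prevalence` helper and item fields are hypothetical, and in production the inclusion probability would be derived from the enforcement model's risk scores rather than fixed constants.

```python
import random

def estimate_prevalence(impressions, inclusion_prob, label_fn):
    """Horvitz-Thompson prevalence estimate over a day of impressions.

    impressions: list of items, one record per view
    inclusion_prob: callable item -> sampling probability in (0, 1]
    label_fn: callable item -> True if the item violates policy
              (stand-in for the multimodal-LLM / human labeling step)
    """
    total = len(impressions)
    weighted_violations = 0.0
    for item in impressions:
        p = inclusion_prob(item)
        if random.random() < p:            # ML-assisted sampling decision
            if label_fn(item):             # label only the sampled items
                weighted_violations += 1.0 / p   # inverse-probability weight
    return weighted_violations / total
```

Because each sampled violation is up-weighted by 1/p, the estimate stays unbiased even though risky items are oversampled, which is what decouples the measurement from any particular enforcement threshold.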

Why it matters: This article demonstrates a practical approach to de-biasing recommendation systems by integrating direct user feedback via surveys into ML model training. Engineers can learn how to move beyond pure engagement metrics to build more user-centric and high-quality content platforms.

  • Pinterest implemented in-app Pinner surveys to gather direct user feedback on content visual quality, moving beyond traditional engagement metrics.
  • The survey design collected at least 10 ratings per image for 5k Pins across diverse interest verticals, averaging scores to ensure data reliability and reduce subjectivity.
  • A machine learning model was trained using this aggregated survey data, mapping image embedding features to a single score (0-1) indicating perceived visual quality.
  • This ML model is integrated into Pinterest's core recommendation systems, including Homefeed, Related Pins, and Search, to promote higher quality content.
  • The approach aims to de-bias recommendation systems, prevent the promotion of low-quality "clickbait," and align content delivery with user well-being and satisfaction.
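The shape of such a model can be sketched as a logistic scorer mapping embedding features to a single 0-1 quality score, fit against averaged survey ratings. This is an illustrative toy, assuming made-up function names and a hand-rolled training loop; the real system would use learned image embeddings and a production training stack.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_quality_model(embeddings, avg_ratings, epochs=200, lr=0.5):
    """Fit a linear-plus-sigmoid map from embedding features to a
    0-1 score by stochastic gradient descent on squared error
    against the averaged survey ratings."""
    dim = len(embeddings[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, avg_ratings):
            pred = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = (pred - y) * pred * (1 - pred)   # d(loss)/d(logit)
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def quality_score(w, b, embedding):
    """Score a new image embedding on the learned 0-1 quality scale."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, embedding)) + b)
```

Averaging at least 10 ratings per image before training is what keeps the individual raters' subjectivity out of the regression target.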

Why it matters: This article demonstrates how Pinterest achieves high-performance AI at significantly lower costs by prioritizing open-source models and fine-tuning with domain-specific data. It's crucial for engineers seeking efficient, scalable, and cost-effective AI development strategies.

  • Pinterest is strategically shifting AI investments towards fine-tuned open-source models, achieving similar quality at less than 10% the cost of proprietary solutions.
  • The competitive edge in AI is moving from large general-purpose LLMs to domain-specific data, personalization, and deep product integration.
  • Pinterest develops user recommendation systems and visual foundation models in-house, leveraging unique, large-scale datasets.
  • For text-based LLMs, Pinterest utilizes a mix of open-source and third-party proprietary models.
  • Open-source multimodal LLMs are enabling differentiation through fine-tuning with proprietary data and end-to-end optimization.
  • The Pinterest Assistant exemplifies this, using an agentic multimodal LLM to route tasks to specialized, Pinterest-native tools, prioritizing tool quality.

Why it matters: This article demonstrates how to overcome legacy observability challenges by pragmatically integrating AI agents and context engineering, offering a blueprint for unifying fragmented data without costly overhauls.

  • Pinterest faced fragmented observability data (logs, traces, metrics) due to legacy infrastructure predating OpenTelemetry, hindering efficient root-cause analysis.
  • They adopted a pragmatic solution using AI agents and a Model Context Protocol (MCP) server to unify disparate observability signals without a full infrastructure overhaul.
  • The MCP server allows AI agents to interact simultaneously with various data pillars (metrics, logs, traces, change events) to find correlations and build hypotheses.
  • This "context engineering" approach aims to provide intelligent agents with comprehensive data, leading to faster, clearer root-cause analysis and actionable insights.
  • The initiative represents a "shift-left" (proactive integration) and "shift-right" (production visibility) strategy, leveraging AI to overcome existing observability limitations.
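The correlation step the MCP server enables might look roughly like this: given an anomaly timestamp, pull observations from each pillar inside a time window and hand them to the agent as hypothesis material. All names and record shapes here are invented for illustration; a real MCP server would expose each pillar as a tool over the protocol rather than as in-process lists.

```python
from datetime import datetime, timedelta

def correlate_signals(anomaly_time, metrics, logs, change_events,
                      window_minutes=15):
    """Gather observations from each observability pillar that fall
    inside a window around an anomaly, as raw material for the
    agent's root-cause hypotheses."""
    window = timedelta(minutes=window_minutes)

    def in_window(ts):
        return abs(ts - anomaly_time) <= window

    return {
        "metrics": [m for m in metrics if in_window(m["ts"])],
        # only error-level logs are likely to be hypothesis material
        "logs": [l for l in logs
                 if in_window(l["ts"]) and l["level"] == "ERROR"],
        # deploys/config changes near the anomaly are prime suspects
        "changes": [c for c in change_events if in_window(c["ts"])],
    }
```

The value of the unified view is exactly this cross-pillar join: a change event landing minutes before a metric anomaly is a correlation no single backend would surface on its own.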

Why it matters: This article highlights how Azure Local provides engineers with flexible, sovereign, and resilient cloud capabilities on-premises or at the edge. It enables deploying AI and critical workloads while meeting strict compliance and operational autonomy requirements, even in disconnected environments.

  • Azure Local extends Azure public cloud infrastructure to customer datacenters and distributed locations, ensuring control, resilience, and operational autonomy for mission-critical workloads.
  • It addresses data sovereignty and compliance needs, enabling AI, scalable compute, and advanced analytics to run locally or at the edge.
  • Key advancements include General Availability for Microsoft 365 Local, NVIDIA RTX GPUs for on-premises AI, and Azure Migrate support.
  • Preview features like AD-less deployments, Rack-Aware Clustering, multi-rack deployments, and fully disconnected operations enhance flexibility and autonomy.
  • Leveraging Azure Arc, Azure Local provides a unified platform for hybrid and disconnected environments, supporting diverse industries like manufacturing and public sector.
  • Integration with Azure IoT and Microsoft Fabric facilitates intelligent physical operations and real-time insights from operational data.

Why it matters: This article demonstrates how to scale agentic AI in complex enterprise environments by balancing LLM reasoning with deterministic logic. It provides a blueprint for reducing latency and ensuring architectural consistency across multi-brand deployments while maintaining high accuracy.

  • Restructured architecture by offloading deterministic tasks like JSON parsing and hierarchical decisioning from the LLM to Apex code to ensure consistency.
  • Reduced multi-stage reasoning latency by approximately 20 seconds by consolidating sequential model calls into a single execution step.
  • Optimized data retrieval by combining Data 360 lookups and order API calls into single, efficient pulls rather than incremental passes.
  • Developed a multi-brand architecture using a shared core logic layer while allowing brand-specific prompt overrides for unique tone and voice.
  • Improved response times by 3–5x through the elimination of redundant reasoning loops and the stabilization of data-flow boundaries.
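The article describes this pattern in Apex; a language-agnostic sketch of the same idea (deterministic JSON parsing and hierarchical decisioning outside the LLM, with a shared core and brand-level prompt overrides) might look like the following. Every intent, action, and prompt string here is invented for illustration.

```python
import json

# Shared core logic; brand-specific overrides change only tone and voice.
BRAND_PROMPTS = {
    "default": "You are a helpful support agent.",
    "brand_a": "You are a cheerful assistant for Brand A.",
}

def route_request(llm_json, brand="default"):
    """Parse the LLM's structured output deterministically and apply
    hierarchical decisioning in code, replacing what would otherwise
    be an extra model call per decision."""
    payload = json.loads(llm_json)
    intent = payload.get("intent", "unknown")
    # Hierarchical decisioning: order issues outrank FAQs outrank fallback.
    if intent == "order_status":
        action = "lookup_order"
    elif intent == "faq":
        action = "answer_from_kb"
    else:
        action = "escalate_to_human"
    prompt = BRAND_PROMPTS.get(brand, BRAND_PROMPTS["default"])
    return {"action": action, "system_prompt": prompt}
```

Moving this branch logic out of the LLM is what buys both the consistency (the same intent always maps to the same action) and the latency win (one reasoning call instead of several).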

Why it matters: Replicate's acquisition by Cloudflare signifies a major step towards building a comprehensive, integrated AI infrastructure. It promises to simplify the deployment and scaling of complex AI applications by combining model serving with a global network and full-stack primitives.

  • Replicate, founded in 2019, aimed to democratize access to research-grade ML models by abstracting away infrastructure complexities.
  • They developed Cog for model packaging and the Replicate platform for running models as cloud API endpoints, successfully scaling with models like Stable Diffusion.
  • The modern AI stack has evolved beyond just model inference, requiring a full suite of services like microservices, storage, and databases.
  • Replicate is joining Cloudflare to leverage Cloudflare's extensive network, Workers, R2, and other primitives to build a complete, integrated AI infrastructure layer.
  • This acquisition will enable faster edge models, model pipelines on Workers, and streaming model I/O, realizing a vision where "the network is the computer" for AI.

Why it matters: This article highlights Python's enduring appeal, its foundational design principles emphasizing readability and accessibility, and its continued dominance in AI and data science, offering insights into language evolution and developer preferences.

  • Python, created by Guido van Rossum, emerged to simplify programming by offering a safer, more expressive alternative to C and shell scripting.
  • Despite TypeScript's recent lead on GitHub, Python grew 49% in 2025, maintaining its status as the default language for AI, science, and education.
  • Its core design emphasizes readability, intuitive syntax, friendly error messages, and a rich standard library, fostering accessibility.
  • Python's open-source nature, cross-platform support, and strong community are key to its versatility and widespread adoption.
  • The language's "irreverent" name reflects a deliberate choice to make programming less intimidating and more welcoming.

Why it matters: This article details advanced techniques in training AI for developer tools, showcasing how custom data collection, SFT, and RL overcome challenges in real-time code prediction. It's crucial for engineers building AI-powered developer experiences and understanding practical LLM deployment.

  • GitHub Copilot's Next Edit Suggestions (NES) uses a custom, low-latency model designed to predict developers' next code edits in real time.
  • Initial attempts with general LLMs and pull request data failed; a custom, high-quality dataset derived from real-time editing sessions was crucial for training.
  • The foundational NES model was developed using Supervised Fine-Tuning (SFT) on this specialized dataset.
  • Reinforcement Learning (RL), incorporating a custom 'grader' model, further refined the NES model, addressing SFT limitations by leveraging unlabeled data and explicitly defining criteria for 'bad' suggestions.
  • This 'AI-native' approach emphasizes end-to-end co-design of model training, prompt engineering, and user experience for seamless IDE integration.
  • Recent improvements focus on prompt optimization to reduce latency and enhance the relevance and quality of suggestions.
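The grader's role can be illustrated with a toy stand-in: a deterministic scorer that explicitly encodes classes of "bad" suggestions. The real grader is itself a trained model applied to unlabeled editing data; the rules below are invented purely to show the interface an RL reward might consume.

```python
def grade_suggestion(original, suggestion):
    """Toy grader: score a proposed next edit in [0, 1], explicitly
    penalizing defined classes of bad suggestions, the kind of
    criteria the RL stage uses as a reward signal."""
    if suggestion.strip() == "":
        return 0.0   # empty suggestion: nothing was proposed
    if suggestion == original:
        return 0.0   # no-op edit: pure noise for the developer
    if len(suggestion) > 4 * max(len(original), 1):
        return 0.2   # oversized rewrite: rarely the intended next edit
    return 1.0       # plausible, proportionate edit
```

The point of grading rather than labeling is that it defines what "bad" means once, and then scales to unlabeled editing sessions where SFT-style supervision is unavailable.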

Why it matters: Engineers can leverage Ax, an open-source ML-driven platform, to efficiently optimize complex systems like AI models and infrastructure. It streamlines experimentation, reduces resource costs, and provides deep insights into system behavior, accelerating development and deployment.

  • Ax 1.0 is an open-source adaptive experimentation platform leveraging machine learning for efficient optimization of complex systems.
  • It's widely used at Meta to improve AI models, tune production infrastructure, and accelerate advances in ML and hardware design.
  • The platform employs Bayesian optimization to guide resource-intensive experiments, identifying optimal configurations efficiently.
  • Ax provides advanced analytical tools, including Pareto frontiers and sensitivity analysis, for deeper system understanding beyond just finding optimal settings.
  • An accompanying paper details Ax's core architecture, methodology, and performance comparison against other black-box optimization libraries.
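One of the analyses mentioned, the Pareto frontier, is easy to sketch in plain Python. This is not Ax's API; it assumes hypothetical trial records with an accuracy to maximize and a latency to minimize, just to show what the frontier computation returns.

```python
def pareto_frontier(trials):
    """Return the trials not dominated by any other trial, where a
    dominator is at least as good on both objectives (accuracy up,
    latency down) and strictly better on one."""
    frontier = []
    for t in trials:
        dominated = any(
            o["accuracy"] >= t["accuracy"] and o["latency"] <= t["latency"]
            and (o["accuracy"] > t["accuracy"] or o["latency"] < t["latency"])
            for o in trials
        )
        if not dominated:
            frontier.append(t)
    return frontier
```

In Ax the frontier is computed over modeled (not just observed) outcomes, but the decision it supports is the same: pick a configuration on the frontier, trading one objective against the other deliberately.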