
Posts tagged with data

Why it matters: This article details how Pinterest uses advanced ML and LLMs to understand complex user intent, moving beyond simple recommendations to goal-oriented assistance. It offers a practical blueprint for building robust, extensible recommendation systems from limited initial data.

  • Pinterest developed a system to identify "user journeys" – sequences of user-item interactions revealing long-term goals beyond immediate interests.
  • The system uses a dynamic keyword extraction approach, leveraging user search history, activity, and boards.
  • Keywords are processed with pretrained text embeddings (e.g., SearchSage) and then hierarchically clustered to form journey candidates (see the sketch after this list).
  • Specialized models handle journey naming (currently keyword-based, evolving to LLMs), expansion (LLM-generated recommendations), ranking, and diversification.
  • The architecture emphasizes lean development, starting small with annotated data, and extensibility for future advanced ML/LLM techniques.
  • The inference pipeline runs on a streaming system for quick adaptation to recent user activities.
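
A minimal sketch of the journey-candidate step described above: embed extracted keywords, then hierarchically cluster them so related queries collapse into candidate journeys. SearchSage is internal to Pinterest, so `embed()` below is a hypothetical stand-in for any pretrained text-embedding model, and the keywords and threshold are illustrative.

```python
# Sketch: keyword embeddings -> hierarchical clustering -> journey candidates.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def embed(keywords: list[str]) -> np.ndarray:
    # Stand-in for a pretrained text embedder such as SearchSage (assumption).
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(keywords), 64))

keywords = ["nursery ideas", "crib bedding", "baby shower games",
            "home office desk", "standing desk setup"]

# distance_threshold controls how aggressively keywords merge into journeys.
clusterer = AgglomerativeClustering(
    n_clusters=None, distance_threshold=12.0, linkage="average")
labels = clusterer.fit_predict(embed(keywords))

for label in sorted(set(labels)):
    members = [k for k, lbl in zip(keywords, labels) if lbl == label]
    print(f"journey candidate {label}: {members}")
```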

Why it matters: DSF tackles the scaling limits of traditional network fabrics for AI. Its disaggregated architecture, packet spraying, and credit-based congestion control deliver high-performance, lossless connectivity for massive GPU clusters, a prerequisite for large-scale AI model training.

  • Meta's Disaggregated Scheduled Fabric (DSF) is a next-generation network technology designed to scale AI training networks beyond the physical limits of traditional Clos-based architectures.
  • DSF disaggregates line cards (Interface Nodes) and fabric cards (Fabric Nodes) into distinct hardware, creating a distributed system for enhanced scalability and performance.
  • It addresses critical challenges in AI workloads, such as "elephant flows" and "low entropy" traffic patterns, which cause congestion and suboptimal utilization in conventional IP fabrics.
  • The system employs a two-domain architecture, packet spraying, and a credit-based congestion control algorithm for efficient, lossless traffic management (a toy model of the last two ideas follows this list).
  • Built on open standards like OCP-SAI and managed by FBOSS, DSF enables the creation of large virtual chassis switches capable of interconnecting thousands of GPUs for massive AI clusters.
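
A toy model of two of those mechanisms, purely illustrative and not Meta's implementation: packets of one flow are sprayed round-robin across all fabric links, and the ingress transmits only when the egress has granted it a buffer credit, so queues can never overflow (the lossless property). The two-domain architecture is not modeled here.

```python
# Toy packet spraying + credit-based flow control.
from collections import deque
from itertools import cycle

FABRIC_LINKS = 4
BUFFER_SLOTS = 8  # egress buffer capacity, expressed as credits

class EgressNode:
    """Grants credits against its free buffer slots, so it never overflows."""
    def __init__(self):
        self.queue = deque()
        self.free_credits = BUFFER_SLOTS

    def grant(self) -> bool:
        if self.free_credits == 0:
            return False
        self.free_credits -= 1
        return True

    def receive(self, pkt):
        self.queue.append(pkt)      # a granted credit reserved this slot

    def drain(self):
        if self.queue:
            self.queue.popleft()
            self.free_credits += 1  # freed slot becomes a fresh credit

def spray(packets, egress: EgressNode):
    links = cycle(range(FABRIC_LINKS))  # spread one flow over every link
    for pkt in packets:
        while not egress.grant():       # no credit: wait, never drop
            egress.drain()              # (modeled by draining one packet)
        egress.receive(pkt)
        print(f"packet {pkt} -> fabric link {next(links)}")

spray(range(12), EgressNode())
```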

Why it matters: This article details how Netflix built a real-time distributed graph to unify disparate data from microservices, enabling complex relationship analysis and personalized experiences. It showcases a robust stream processing architecture for internet-scale data.

  • Netflix developed a Real-Time Distributed Graph (RDG) to unify member interaction data across diverse services and devices, addressing data silos from their microservices architecture.
  • The RDG provides advantages like relationship-centric queries, schema flexibility, and efficient pattern detection over traditional data warehousing.
  • Its ingestion and processing pipeline relies on a stream processing architecture for real-time updates, crucial for maintaining an up-to-date graph.
  • Apache Kafka acts as the ingestion backbone, handling up to 1M messages/second, with Avro-encoded records and schema registry.
  • Apache Flink jobs process these Kafka streams in near real-time, leveraging robust internal platform support for integration (see the ingestion sketch after this list).
  • Data is also persisted to Apache Iceberg for backfilling, complementing Kafka's retention policies.
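
A minimal PyFlink sketch of that ingestion path: consume interaction events from Kafka and map them to graph-edge upserts. The topic name, event fields, and print sink are assumptions; the real jobs also decode Avro records against a schema registry, omitted here for brevity.

```python
# Sketch: Kafka source -> Flink map -> graph-edge upserts.
import json
from pyflink.common import WatermarkStrategy
from pyflink.common.serialization import SimpleStringSchema
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors.kafka import KafkaSource, KafkaOffsetsInitializer

env = StreamExecutionEnvironment.get_execution_environment()

source = (KafkaSource.builder()
          .set_bootstrap_servers("kafka:9092")          # assumption
          .set_topics("member-interactions")            # assumption
          .set_group_id("rdg-ingest")
          .set_starting_offsets(KafkaOffsetsInitializer.latest())
          .set_value_only_deserializer(SimpleStringSchema())
          .build())

def to_edge(raw: str) -> str:
    # Turn one interaction event into a graph edge upsert (hypothetical shape).
    event = json.loads(raw)
    return json.dumps({"src": event["member_id"], "dst": event["entity_id"],
                       "rel": event["action"], "ts": event["timestamp"]})

(env.from_source(source, WatermarkStrategy.no_watermarks(), "interactions")
    .map(to_edge)
    .print())  # stand-in for the real graph-store sink

env.execute("rdg-ingestion-sketch")
```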

Why it matters: This article offers engineers actionable design principles to reduce IT hardware's environmental impact, fostering sustainability and cost savings through circularity and emissions reduction in data center infrastructure.

  • Meta introduces "Design for Sustainability" principles for IT hardware to cut emissions and costs via reuse, extended life, and optimized design.
  • Key strategies include modularity, retrofitting, dematerialization, greener materials, and extending hardware lifecycles in data centers.
  • The focus is on reducing Scope 3 emissions from manufacturing, delivery, and end-of-life of IT hardware components.
  • Methods involve optimizing material selection, using lower carbon alternatives, extending rack life, and harvesting components for reuse.
  • These principles apply across various rack types (AI, Compute, Storage, Network) and target components like compute, storage, and cooling.
  • Collaboration with suppliers to electrify processes and transition to renewable energy is crucial for achieving net-zero goals.
  • The initiative also significantly reduces electronic waste (e-waste) generated from data centers.

Why it matters: Building reliable LLM applications requires moving beyond ad-hoc testing. This framework shows engineers how to implement a rigorous, code-like evaluation pipeline to manage the unpredictability of probabilistic AI components and ensure consistent performance at scale.

  • LLM pipelines involve complex probabilistic stages like intent classification and retrieval, requiring systematic evaluation to prevent regressions.
  • Dropbox Dash moved from ad-hoc testing to an evaluation-first approach, treating every model or prompt change with the same rigor as production code.
  • A hybrid dataset strategy combines public benchmarks like MS MARCO for baselining with internal production logs to capture real-world user behavior.
  • Synthetic data generation using LLMs helps create evaluation sets for diverse content types, including tables, images, and factual lookups.
  • Traditional NLP metrics like BLEU and ROUGE are often inadequate for RAG systems, necessitating more actionable, task-specific rubrics (a test-style gate is sketched after this list).
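
A minimal sketch of the evaluation-first idea: every model or prompt change runs a fixed eval set through the pipeline and is gated on task-specific metrics, the way unit tests gate code. The dataset shape, retrieve() stub, and threshold are assumptions.

```python
# Sketch: a retrieval eval that runs (and fails) like a unit test.
from dataclasses import dataclass

@dataclass
class EvalCase:
    query: str
    relevant_doc_ids: set[str]  # labeled from logs or synthetic generation

def retrieve(query: str, k: int = 10) -> list[str]:
    # Stand-in for the real retrieval stage under test (assumption).
    return ["doc-1", "doc-7", "doc-3"]

def hit_rate_at_k(cases: list[EvalCase], k: int = 10) -> float:
    # Fraction of queries whose top-k results contain any relevant doc.
    hits = sum(bool(set(retrieve(c.query, k)) & c.relevant_doc_ids)
               for c in cases)
    return hits / len(cases)

def test_retrieval_no_regression():
    cases = [EvalCase("quarterly revenue table", {"doc-3"}),
             EvalCase("onboarding checklist", {"doc-7", "doc-9"})]
    # Gate the change: fail CI if the metric drops below the current baseline.
    assert hit_rate_at_k(cases) >= 0.80
```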

Why it matters: This article details how Netflix made its workflow orchestrator 100X faster, crucial for supporting evolving business needs like real-time data processing and low-latency applications. It shows how an engine redesign can pay off in both scalability and developer productivity.

  • Netflix's Maestro workflow orchestrator achieved a 100X performance improvement, reducing overhead from seconds to milliseconds for Data/ML workflows.
  • The previous Maestro engine, based on deprecated Conductor 2.x, suffered from performance bottlenecks and race conditions due to its internal flow engine layer.
  • New business needs like Live, Ads, Games, and low-latency use cases necessitated a high-performance workflow engine.
  • The team evaluated options including upgrading Conductor, using Temporal, or implementing a custom internal flow engine.
  • They opted to rewrite Maestro's internal flow engine to simplify the architecture, eliminate complex database synchronizations, and ensure strong guarantees.

Why it matters: This article details how Netflix built a robust WAL system to solve common, critical data challenges like consistency, replication, and reliable retries at massive scale. It offers a blueprint for building resilient data platforms, enhancing developer efficiency and preventing outages.

  • Netflix developed a generic, distributed Write-Ahead Log (WAL) system to address critical data challenges at scale, including data loss, corruption, and replication.
  • The WAL provides strong durability guarantees and reliably delivers data changes to various downstream consumers.
  • Its simple WriteToLog API abstracts internal complexities, using namespaces to select the backing storage (Kafka, SQS) and its configuration (a hypothetical caller is sketched after this list).
  • Key use cases (personas) include enabling delayed message queues for reliable retries in real-time data pipelines.
  • It facilitates generic cross-region data replication for services like EVCache.
  • The WAL also supports complex operations like handling multi-partition mutations in Key-Value stores, ensuring eventual consistency via two-phase commit.
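
A hypothetical sketch of what calling the WAL might look like. The article names the WriteToLog API and its namespace abstraction; the request shape, field names, and client below are assumptions.

```python
# Sketch: a delayed-queue retry via a WriteToLog-style API (hypothetical).
from dataclasses import dataclass

@dataclass
class WriteToLogRequest:
    namespace: str          # selects the backing queue (e.g., Kafka or SQS)
    payload: bytes
    delay_seconds: int = 0  # >0 models the delayed-queue persona for retries

class WalClient:
    def write_to_log(self, req: WriteToLogRequest) -> None:
        # The real service durably appends the record, then a consumer
        # delivers it downstream after req.delay_seconds.
        print(f"appended {len(req.payload)} bytes to '{req.namespace}' "
              f"(deliver after {req.delay_seconds}s)")

# A pipeline retries a failed mutation by re-enqueueing it with a delay.
wal = WalClient()
wal.write_to_log(WriteToLogRequest(namespace="kv-mutations",
                                   payload=b'{"op": "put", "key": "k1"}',
                                   delay_seconds=30))
```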

Why it matters: This article details how a large-scale key-value store was rearchitected to meet modern demands for real-time data, scalability, and operational efficiency. It offers valuable insights into addressing common distributed system challenges and executing complex migrations.

  • Airbnb rearchitected its core key-value store, Mussel, from v1 to v2 to handle real-time demands, massive data, and improve operational efficiency.
  • Mussel v1 faced issues with operational complexity, static partitioning leading to hotspots, limited consistency, and opaque costs.
  • Mussel v2 leverages Kubernetes for automation, dynamic range sharding for scalability (sketched after this list), flexible consistency, and enhanced cost visibility.
  • The new architecture includes a stateless Dispatcher, Kafka-backed writes for durability, and an event-driven model for ingestion.
  • Bulk data loading is supported via Airflow orchestration and distributed workers, maintaining familiar semantics.
  • Automated TTL in v2 uses a topology-aware expiration service for efficient, parallel data deletion, improving on v1's compaction cycle.
  • A blue/green migration strategy with custom bootstrapping and dual writes ensured a seamless transition with zero downtime and data loss.
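
A minimal sketch of the dynamic range sharding that replaced v1's static partitioning, under assumed boundaries and split policy: keys route to the shard whose key range contains them, and a hot range is relieved by simply inserting a new boundary rather than rebuilding a static layout.

```python
# Sketch: range-sharded key routing with dynamic splits.
import bisect

class RangeShardMap:
    def __init__(self, boundaries: list[str]):
        self.boundaries = sorted(boundaries)  # shard i covers keys < boundaries[i]

    def shard_for(self, key: str) -> int:
        return bisect.bisect_right(self.boundaries, key)

    def split(self, boundary: str) -> None:
        # Splitting a hot range is just adding a boundary; traffic for the
        # range is then shared by two shards, relieving the hotspot.
        bisect.insort(self.boundaries, boundary)

shards = RangeShardMap(boundaries=["g", "n", "t"])
print(shards.shard_for("listing:1234"))  # -> shard 1 ('g' <= key < 'n')
shards.split("listing:m")                # split the hot 'g'..'n' range
print(shards.shard_for("listing:1234"))  # still shard 1, but its range shrank
```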

Why it matters: This article details how Netflix scaled a critical OLAP application to handle trillions of rows and complex queries. It showcases practical strategies using approximate distinct counts (HLL) and in-memory precomputed aggregates (Hollow) to achieve high performance and data accuracy.

  • Netflix's Muse application, an OLAP system for creative insights, evolved its architecture to handle trillions of rows and complex queries.
  • The updated data serving layer leverages HyperLogLog (HLL) sketches for efficient, approximate distinct counts, cutting query latencies by roughly 50% (see the sketch after this list).
  • Hollow is used as a read-only, in-memory key-value store for precomputed aggregates, offloading Druid and improving performance for specific data access patterns.
  • The architecture now includes React, GraphQL, and Spring Boot gRPC microservices, with significant tuning applied to the Druid cluster.
  • The solution addresses challenges like dynamic analysis by audience affinities and combinatorial data explosion.
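
A minimal illustration of why HLL sketches cut latency: distinct counts are precomputed as small, mergeable sketches, so query time merges fixed-size structures instead of re-scanning raw rows. This uses the datasketch library as a stand-in; Muse relies on Druid's native HLL support.

```python
# Sketch: precomputed, mergeable approximate distinct counts with HLL.
from datasketch import HyperLogLog

def sketch(ids):
    hll = HyperLogLog(p=12)  # 2^12 registers: ~1.6% relative error
    for i in ids:
        hll.update(str(i).encode("utf-8"))
    return hll

# Precompute one sketch per day (or per audience segment) at ingest time.
monday = sketch(range(0, 60_000))
tuesday = sketch(range(40_000, 90_000))  # overlaps Monday's audience

# At query time, merging sketches answers "distinct viewers Mon-Tue"
# without touching raw rows; the union is approximate but cheap.
monday.merge(tuesday)
print(f"approx distinct: {monday.count():,.0f} (exact: 90,000)")
```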

Why it matters: This article showcases a successful approach to managing a large, evolving data graph in a service-oriented architecture. It provides insights into how a data-oriented service mesh can simplify developer experience, improve modularity, and scale efficiently.

  • Viaduct, Airbnb's data-oriented service mesh, has been open-sourced after five years of significant growth and evolution within the company.
  • It's built on three core principles: a central, integrated GraphQL schema, hosting business logic directly within the mesh, and re-entrancy for modular composition.
  • The "Viaduct Modern" initiative simplified its developer-facing Tenant API, reducing complexity from multiple mechanisms to just node and field resolvers.
  • Modularity was enhanced through formal "tenant modules," enabling teams to own schema and code while composing via GraphQL fragments and queries, avoiding direct code dependencies.
  • This modernization effort has allowed Viaduct to scale dramatically (8x traffic, 3x codebase) while maintaining operational efficiency and reducing incidents.
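
A rough Python analogue of the two resolver kinds the Tenant API settled on: a node resolver loads an entity by id, and field resolvers derive individual fields on it. Viaduct itself is a JVM/GraphQL system; the ariadne library and the Listing schema below are stand-ins.

```python
# Sketch: node resolver vs. field resolver over a tiny GraphQL schema.
from ariadne import ObjectType, QueryType, graphql_sync, make_executable_schema

type_defs = """
type Query {
  node(id: ID!): Listing
}
type Listing {
  id: ID!
  title: String!
  displayTitle: String!
}
"""

query = QueryType()
listing = ObjectType("Listing")

@query.field("node")
def resolve_node(_, info, id):
    # Node resolver: the one place that knows how to fetch the entity (stubbed).
    return {"id": id, "title": "Cozy cabin"}

@listing.field("displayTitle")
def resolve_display_title(obj, info):
    # Field resolver: derives a field from the node; owned by one tenant module.
    return obj["title"].upper()

schema = make_executable_schema(type_defs, query, listing)
_, result = graphql_sync(schema, {"query": '{ node(id: "1") { displayTitle } }'})
print(result["data"])  # {'node': {'displayTitle': 'COZY CABIN'}}
```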