Managing resources at scale requires more than just hard limits. Piqama provides a unified framework for capacity and rate-limiting, enabling automated rightsizing and budget alignment. This reduces manual overhead while improving resource efficiency and system reliability across platforms.
Authors: Junkai Xue | Sr Staff Software Engineer, Big Data Processing Platform; Zheyu Zha | Staff Software Engineer, Big Data Processing Platform; Jia Zhan | Principal Engineer, Online Systems; Alberto Ordonez Pereira | Sr Staff Software Engineer, Online Systems
A quota is an official limit on the usage or production of a specific resource. At Pinterest, we are developing a robust, generic quota management platform (Piqama) designed to manage a wide range of resources — including physical resources like memory and CPU, service resources such as QPS (queries per second) and network bandwidth, as well as application-specific quota units. Our ecosystem provides seamless quota lifecycle management, a user-friendly management portal, low-latency quota value broadcasting, quota updates, prediction, and rightsizing capabilities. In this blog, we illustrate how the quota management platform enables both capacity quota management for the Pinterest BigData Platform and rate-limiting quotas for Pinterest Online Services, showcasing its flexibility and impact.
Piqama is Pinterest’s Quota Management Ecosystem, created to oversee quotas across diverse systems and quota types, while accommodating multiple platforms and scenarios. Each application either utilizes its own specific quota enforcement logic or leverages the simple, default enforcement mechanisms provided by Piqama. The following section details its architecture:

The Piqama ecosystem provides a comprehensive management portal, accessible via REST and Thrift. It handles the entire quota lifecycle, including updates and usage feedback. After collecting usage statistics, a suite of offline features assists with data governance and efficiency optimization. Further details are available in the following sections.
A centralized management portal improves the user experience by streamlining quota management across all stages, from upstream to downstream. This portal also minimizes errors by providing user-defined and searchable quota breakdowns, allowing for quick and accurate access to the correct quotas. Below is a UI example illustrating how the quota is visualized:

Piqama is a comprehensive quota management ecosystem designed to handle the entire quota lifecycle. It offers a range of functionalities accessible through its UI portal, REST API, and Thrift client:
As a generic quota management platform, Piqama emphasizes customization, enabling different application systems to integrate their specific logic for schema management, validation, dispatching, and enforcement.
In addition to quota management, Piqama also provides post-implementation governance and optimization capabilities.
Piqama clients transparently collect enforcement and usage statistics when applications integrate with them. For applications not using Piqama clients, system-based and storage-based feedback loops are available. A predefined schema and storage format ensure that once applications provide data in the correct format, statistics are stored in Apache Iceberg on Amazon S3. These stored statistics are also pre-aggregated to optimize storage space.
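To make the pre-aggregation step concrete, here is a minimal sketch of collapsing raw usage records into one row per (project, hour, quota type). The record fields and function names are illustrative assumptions; the actual Piqama schema is not described in this post.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical fields; the real Piqama schema is not public.
    project: str
    hour: str          # e.g. "2024-05-01T13"
    quota_type: str    # e.g. "memory_gb"
    used: float

def pre_aggregate(records):
    """Collapse raw records into one row per (project, hour, quota_type),
    summing usage so the stored table is far smaller than the raw feed."""
    totals = defaultdict(float)
    for r in records:
        totals[(r.project, r.hour, r.quota_type)] += r.used
    return dict(totals)
```

In practice the aggregated rows would be written to the Iceberg tables on S3 mentioned above; the sketch only shows the space-saving roll-up itself.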
The stored statistical data enables efficient quota auto-rightsizing. Piqama’s framework allows a separate auto-rightsizing service to continuously consume historical data from various sources, including Presto, Iceberg, and user-defined data sources. This service applies rightsizing strategies designed to predict needs based on organic usage growth, traffic bursts, and underutilization detection. Currently, a rightsizing strategy has been developed for capacity-based quotas, aiming to allocate maximum resources without saturating the system for a Big Data Processing Platform within an organization.
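One common family of rightsizing heuristics sizes a quota from a high percentile of historical usage plus growth headroom. The sketch below illustrates that idea only; the percentile, headroom, and floor are invented parameters, not Piqama's actual strategy.

```python
import math

def rightsize(history, headroom=1.2, percentile=0.95, floor=1.0):
    """Suggest a quota from historical usage samples: take a high
    percentile (to absorb bursts without chasing outliers), add
    headroom for organic growth, and never drop below a safety floor.
    All defaults here are illustrative assumptions."""
    if not history:
        return floor
    ordered = sorted(history)
    idx = min(len(ordered) - 1, math.ceil(percentile * len(ordered)) - 1)
    return max(floor, ordered[idx] * headroom)
```

A strategy like this naturally handles both underutilization (the percentile tracks actual use downward) and organic growth (headroom keeps the ceiling ahead of demand).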
Budgeting involves allocating specific dollar amounts to various organizations, teams, or projects. This directly influences quota setup, as quotas represent the resources available based on the allocated budget.
A chargeback system is essential for translating resource usage into real costs, which then draw from the planned budget. Exceeding the budget can lead to penalties in resource allocation. For example, in the Big Data Processing Platform, projects that go over budget may see a reduction of X% in their resources, depending on their tier. In such cases, teams must either secure additional budget or re-prioritize their workloads if they are not critical. Future work will detail the ongoing integration of Piqama with the Pinterest Entitlement system.
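The penalty rule described above can be sketched as a simple function. The blog does not disclose the actual per-tier percentages (only "X%"), so the penalty is passed in as a parameter rather than hard-coded.

```python
def apply_overbudget_penalty(allocated, spent, budget, penalty_pct):
    """If chargeback spend exceeds the planned budget, shrink the
    resource allocation by penalty_pct (the post's unspecified "X%",
    supplied per tier). Within budget, the allocation is untouched."""
    if spent <= budget:
        return allocated
    return allocated * (1 - penalty_pct / 100)
```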
Pinterest has integrated, or is in the process of integrating, several systems with Piqama. Below are two examples of these integrations, demonstrating how Piqama handles both capacity-based (Big Data Processing Platform) and rate-limiting-based (Online Storage Systems) quota systems.
Moka, the next-generation massive-scale platform developed for Big Data Processing, utilizes the Apache open-source project Yunikorn as its resource scheduling framework. This framework is responsible for managing resources (such as memory, GPU, and CPU) for batch processing jobs.
Piqama plays a crucial role in managing physical resources like memory and vcore within the Big Data Processing Platform. At its heart, Piqama stores a comprehensive set of quota values for each project. These values are not static but dynamically managed, encompassing:
Quota values are generated through two methods:

A Yunikorn Config Updater regularly checks Piqama for updated quota values and adjusts the Yunikorn configurations accordingly. Subsequently, each application is submitted and executed within its dedicated Yunikorn queue.
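The updater's control flow can be sketched as a poll-and-diff loop. The client objects and method names below are stand-ins for the real Piqama and Yunikorn interfaces, which are not shown in this post; only the reconciliation pattern is meant to match.

```python
import time

def run_config_updater(piqama_client, yunikorn, poll_interval_s=60, iterations=None):
    """Poll Piqama for quota values and push only the changed ones into
    the Yunikorn queue configuration. Both clients are hypothetical
    stand-ins; iterations=None would run the loop forever."""
    last_seen = {}
    n = 0
    while iterations is None or n < iterations:
        quotas = piqama_client.fetch_quotas()   # e.g. {project: {"memory": ..., "vcore": ...}}
        for project, limits in quotas.items():
            if last_seen.get(project) != limits:
                yunikorn.update_queue(project, limits)  # reconfigure that project's queue
                last_seen[project] = limits
        n += 1
        if iterations is None or n < iterations:
            time.sleep(poll_interval_s)
```

Diffing against the last-seen values avoids rewriting unchanged queue configurations on every poll.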
Upon application completion, Yunikorn Application Summary statistics, including resource usage, are recorded in an S3 file. This data is then aggregated into a resource database. This comprehensive resource database serves two critical functions:
When a project’s resource usage exceeds its allocated budget within a defined time window, Piqama triggers an enforcement mechanism. The maximum resources available to that project are dynamically lowered. This proactive measure effectively controls the “burning speed” of resources for the over-budget entity, ensuring that available resources are prioritized and allocated to projects that are operating within their defined budgets. This intelligent enforcement mechanism is critical for maintaining overall system health, preventing resource starvation for compliant projects, and fostering a culture of responsible resource consumption across the Pinterest Big Data Processing Platform.
Currently, Piqama completely manages Moka’s quota lifecycle, eliminating the need for manual intervention, though quota adjustments can still be made via the UI for special requirements. The key future enhancement for Moka, particularly for upcoming quota projects, will be an improved auto-rightsizing strategy to optimize resource allocation and utilization.
Pinterest needs to improve its existing rate limiting framework for online storage services to better handle overload in its multi-tenant environment. This enhancement is crucial for ensuring fair resource allocation among tenants, maintaining system reliability, and controlling costs. The current framework falls short due to several limitations:
Consequently, the present rate limits often fail to accurately reflect actual resource consumption. This inaccuracy undermines their effectiveness in protecting database servers and makes them unreliable for accurate capacity planning.
As we design our next-generation rate limiting framework, we’d like to streamline the lifecycle management of rate limits, and also treat rate limits as a notion of “quota” that’s linked to the actual system resource usage, for better cost control and budgeting management. This is where Piqama comes into play. Effectively we are leveraging Piqama as the control plane for our rate limiting framework, with the following design principles:
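A classic way to wire a rate limiter to a control plane is a token bucket whose refill rate is fetched dynamically rather than hard-coded. The sketch below shows that pattern only; the actual SPF/Piqama data path is more involved, and `get_rate` is a stand-in for a control-plane lookup.

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter whose allowed rate comes from a
    control plane, modeled here as a callable returning the current
    quota in requests/second. Standard token-bucket mechanics."""

    def __init__(self, get_rate, capacity, clock=time.monotonic):
        self.get_rate = get_rate   # dynamically fetched rate limit
        self.capacity = capacity   # maximum burst size
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill based on elapsed time and the *current* quota value,
        # so a control-plane update takes effect on the next request.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.get_rate())
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Because the rate is re-read on every refill, lowering a tenant's quota in the control plane throttles its traffic without restarting or reconfiguring the data-plane service.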

On a high level:
As of this writing, we have successfully completed the initial integration of Piqama with several critical online storage services, including TiDB and Key-Value Stores. We are currently onboarding more use cases and have future plans for dynamic rightsizing and budget integration.
The in-house rate limiting framework extends beyond basic rate limiting, providing capabilities for general throttling and concurrency control. We call it the Service-Protection Framework (SPF). We’ll defer the details of SPF to a future blog post.
Our recent state-of-the-union review revealed significant product momentum within this ecosystem, driven by a few core interests:
As we continue to enhance support, we anticipate a growing number of users will leverage Piqama for various high-impact scenarios, including:
Looking forward, upcoming enhancements for Piqama will focus on several strategic areas:
These investments will empower teams to manage resources more efficiently, driving both operational efficiency and innovation across the platform.
Piqama: Pinterest Quota Management Ecosystem was originally published in Pinterest Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.