Managing resources at scale requires more than just hard limits. Piqama provides a unified framework for capacity and rate-limiting, enabling automated rightsizing and budget alignment. This reduces manual overhead while improving resource efficiency and system reliability across platforms.
Authors: Junkai Xue | Sr Staff Software Engineer, Big Data Processing Platform; Zheyu Zha | Staff Software Engineer, Big Data Processing Platform; Jia Zhan | Principal Engineer, Online Systems; Alberto Ordonez Pereira | Sr Staff Software Engineer, Online Systems
A quota is an official limit on the usage or production of a specific resource. At Pinterest, we are developing a robust, generic quota management platform (Piqama) designed to manage a wide range of resources — including physical resources like memory and CPU, service resources such as QPS (queries per second) and network bandwidth, as well as application-specific quota units. Our ecosystem provides seamless quota lifecycle management, a user-friendly management portal, low-latency quota value broadcasting, quota updates, prediction, and rightsizing capabilities. In this blog, we illustrate how the quota management platform enables both capacity quota management for the Pinterest BigData Platform and rate-limiting quotas for Pinterest Online Services, showcasing its flexibility and impact.
Piqama is Pinterest’s Quota Management Ecosystem, created to oversee quotas across diverse systems and quota types, while accommodating multiple platforms and scenarios. Each application either utilizes its own specific quota enforcement logic or leverages the simple, default enforcement mechanisms provided by Piqama. The following section details its architecture:

The Piqama ecosystem provides a comprehensive management portal, accessible via REST and Thrift. It handles the entire quota lifecycle, including updates and usage feedback. After collecting usage statistics, a suite of offline features assists with data governance and efficiency optimization. Further details are available in the following sections.
A centralized management portal improves the user experience by streamlining quota management across all stages, from upstream to downstream. This portal also minimizes errors by providing user-defined and searchable quota breakdowns, allowing for quick and accurate access to the correct quotas. Below is a UI example illustrating how the quota is visualized:

Piqama is a comprehensive quota management ecosystem designed to handle the entire quota lifecycle. It offers a range of functionalities accessible through its UI portal, REST API, and Thrift client:
As a generic quota management platform, Piqama emphasizes customization, enabling different application systems to integrate their specific logic for schema management, validation, dispatching, and enforcement.
In addition to quota management, Piqama also provides post-implementation governance and optimization capabilities.
Piqama clients transparently collect enforcement and usage statistics when applications integrate with them. For applications not using Piqama clients, system-based and storage-based feedback loops are available. A predefined schema and storage format ensure that once applications provide data in the correct format, statistics are stored in Apache Iceberg on Amazon S3. These stored statistics are also pre-aggregated to optimize storage space.
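To make the pre-aggregation step concrete, here is a minimal sketch of collapsing raw usage records into one row per (project, hour, quota type). The record fields and function names are illustrative assumptions; the actual Piqama schema is not described in this post.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical fields; the real Piqama schema is not public.
    project: str
    hour: str          # e.g. "2024-05-01T13"
    quota_type: str    # e.g. "memory_gb"
    used: float

def pre_aggregate(records):
    """Collapse raw records into one row per (project, hour, quota_type),
    summing usage so the stored table is far smaller than the raw feed."""
    totals = defaultdict(float)
    for r in records:
        totals[(r.project, r.hour, r.quota_type)] += r.used
    return dict(totals)
```

In practice the aggregated rows would be written to the Iceberg tables on S3 mentioned above; the sketch only shows the space-saving roll-up itself.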
The stored statistical data enables efficient quota auto-rightsizing. Piqama’s framework allows a separate auto-rightsizing service to continuously consume historical data from various sources, including Presto, Iceberg, and user-defined data sources. This service applies rightsizing strategies designed to predict needs based on organic usage growth, traffic bursts, and underutilization detection. Currently, a rightsizing strategy has been developed for capacity-based quotas, aiming to allocate maximum resources without saturating the system for a Big Data Processing Platform within an organization.
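One common family of rightsizing heuristics sizes a quota from a high percentile of historical usage plus growth headroom. The sketch below illustrates that idea only; the percentile, headroom, and floor are invented parameters, not Piqama's actual strategy.

```python
import math

def rightsize(history, headroom=1.2, percentile=0.95, floor=1.0):
    """Suggest a quota from historical usage samples: take a high
    percentile (to absorb bursts without chasing outliers), add
    headroom for organic growth, and never drop below a safety floor.
    All defaults here are illustrative assumptions."""
    if not history:
        return floor
    ordered = sorted(history)
    idx = min(len(ordered) - 1, math.ceil(percentile * len(ordered)) - 1)
    return max(floor, ordered[idx] * headroom)
```

A strategy like this naturally handles both underutilization (the percentile tracks actual use downward) and organic growth (headroom keeps the ceiling ahead of demand).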
Budgeting involves allocating specific dollar amounts to various organizations, teams, or projects. This directly influences quota setup, as quotas represent the resources available based on the allocated budget.
A chargeback system is essential for translating resource usage into real costs, which then draw from the planned budget. Exceeding the budget can lead to penalties in resource allocation. For example, in the Big Data Processing Platform, projects that go over budget may see a reduction of X% in their resources, depending on their tier. In such cases, teams must either secure additional budget or re-prioritize their workloads if they are not critical. Future work will detail the ongoing integration of Piqama with the Pinterest Entitlement system.
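The penalty rule described above can be sketched as a simple function. The blog does not disclose the actual per-tier percentages (only "X%"), so the penalty is passed in as a parameter rather than hard-coded.

```python
def apply_overbudget_penalty(allocated, spent, budget, penalty_pct):
    """If chargeback spend exceeds the planned budget, shrink the
    resource allocation by penalty_pct (the post's unspecified "X%",
    supplied per tier). Within budget, the allocation is untouched."""
    if spent <= budget:
        return allocated
    return allocated * (1 - penalty_pct / 100)
```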
Pinterest has integrated, or is in the process of integrating, several systems with Piqama. Below are two examples of these integrations, demonstrating how Piqama handles both capacity-based (Big Data Processing Platform) and rate-limiting-based (Online Storage Systems) quota systems.
Moka, the next-generation massive-scale platform developed for Big Data Processing, utilizes the Apache open-source project Yunikorn as its resource scheduling framework. This framework is responsible for managing resources (such as memory, GPU, and CPU) for batch processing jobs.
Piqama plays a crucial role in managing physical resources like memory and vcore within the Big Data Processing Platform. At its heart, Piqama stores a comprehensive set of quota values for each project. These values are not static but dynamically managed, encompassing:
Quota values are generated through two methods:

A Yunikorn Config Updater regularly checks Piqama for updated quota values and adjusts the Yunikorn configurations accordingly. Subsequently, each application is submitted and executed within its dedicated Yunikorn queue.
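The updater's control flow can be sketched as a poll-and-diff loop. The client objects and method names below are stand-ins for the real Piqama and Yunikorn interfaces, which are not shown in this post; only the reconciliation pattern is meant to match.

```python
import time

def run_config_updater(piqama_client, yunikorn, poll_interval_s=60, iterations=None):
    """Poll Piqama for quota values and push only the changed ones into
    the Yunikorn queue configuration. Both clients are hypothetical
    stand-ins; iterations=None would run the loop forever."""
    last_seen = {}
    n = 0
    while iterations is None or n < iterations:
        quotas = piqama_client.fetch_quotas()   # e.g. {project: {"memory": ..., "vcore": ...}}
        for project, limits in quotas.items():
            if last_seen.get(project) != limits:
                yunikorn.update_queue(project, limits)  # reconfigure that project's queue
                last_seen[project] = limits
        n += 1
        if iterations is None or n < iterations:
            time.sleep(poll_interval_s)
```

Diffing against the last-seen values avoids rewriting unchanged queue configurations on every poll.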
Upon application completion, Yunikorn Application Summary statistics, including resource usage, are recorded in an S3 file. This data is then aggregated into a resource database. This comprehensive resource database serves two critical functions:
When a project’s resource usage exceeds its allocated budget within a defined time window, Piqama triggers an enforcement mechanism. The maximum resources available to that project are dynamically lowered. This proactive measure effectively controls the “burning speed” of resources for the over-budget entity, ensuring that available resources are prioritized and allocated to projects that are operating within their defined budgets. This intelligent enforcement mechanism is critical for maintaining overall system health, preventing resource starvation for compliant projects, and fostering a culture of responsible resource consumption across the Pinterest Big Data Processing Platform.
Currently, Piqama completely manages Moka’s quota lifecycle, eliminating the need for manual intervention, though quota adjustments can still be made via the UI for special requirements. The key future enhancement for Moka, particularly for upcoming quota projects, will be an improved auto-rightsizing strategy to optimize resource allocation and utilization.
Pinterest needs to improve its existing rate limiting framework for online storage services to better handle overload in its multi-tenant environment. This enhancement is crucial for ensuring fair resource allocation among tenants, maintaining system reliability, and controlling costs. The current framework falls short due to several limitations:
Consequently, the present rate limits often fail to accurately reflect actual resource consumption. This inaccuracy undermines their effectiveness in protecting database servers and makes them unreliable for accurate capacity planning.
As we design our next-generation rate limiting framework, we’d like to streamline the lifecycle management of rate limits, and also treat rate limits as a notion of “quota” that’s linked to the actual system resource usage, for better cost control and budgeting management. This is where Piqama comes into play. Effectively we are leveraging Piqama as the control plane for our rate limiting framework, with the following design principles:
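A classic way to wire a rate limiter to a control plane is a token bucket whose refill rate is fetched dynamically rather than hard-coded. The sketch below shows that pattern only; the actual SPF/Piqama data path is more involved, and `get_rate` is a stand-in for a control-plane lookup.

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter whose allowed rate comes from a
    control plane, modeled here as a callable returning the current
    quota in requests/second. Standard token-bucket mechanics."""

    def __init__(self, get_rate, capacity, clock=time.monotonic):
        self.get_rate = get_rate   # dynamically fetched rate limit
        self.capacity = capacity   # maximum burst size
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill based on elapsed time and the *current* quota value,
        # so a control-plane update takes effect on the next request.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.get_rate())
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Because the rate is re-read on every refill, lowering a tenant's quota in the control plane throttles its traffic without restarting or reconfiguring the data-plane service.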

On a high level:
As of this writing, we have successfully completed the initial integration of Piqama with several critical online storage services, including TiDB and Key-Value Stores. We are currently onboarding more use cases and have future plans for dynamic rightsizing and budget integration.
The in-house rate limiting framework extends beyond basic rate limiting, providing capabilities for general throttling and concurrency control. We call it the Service-Protection Framework (SPF). We’ll defer the details of SPF to a future blog post.
Our recent state-of-the-union review revealed significant product momentum within this ecosystem, driven by a few core interests:
As we continue to enhance support, we anticipate a growing number of users will leverage Piqama for various high-impact scenarios, including:
Looking forward, upcoming enhancements for Piqama will focus on several strategic areas:
These investments will empower teams to manage resources more efficiently, driving both operational efficiency and innovation across the platform.
Piqama: Pinterest Quota Management Ecosystem was originally published in Pinterest Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.