The use of AI is growing faster than the infrastructure that supports it, and this gap is beginning to impact security, resilience, and access.
A new position paper argues that access to AI should be treated as an intergenerational civil right rather than a service shaped primarily by market forces. The study examines what happens when growing demand for AI collides with limits on energy, network capacity, and computation, then proposes a new delivery model to avoid worsening inequality. Even as models continue to improve, the authors argue, access to their outputs will narrow over time unless the underlying delivery architecture changes.
Demand growth faces physical limits
The paper models how AI inference demand could evolve as AI becomes integrated into everyday applications. It assumes a peak load of 60 AI requests per second per user. If around 30% of smartphones support AI features, the authors estimate that mobile inference alone could generate more than 5 trillion queries per minute at peak.
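A back-of-the-envelope check shows how the estimate adds up, assuming roughly 5 billion smartphones worldwide (our figure for illustration, not one stated in the paper):

smartphones = 5e9                 # assumed global smartphone count
ai_capable = 0.30 * smartphones   # ~1.5 billion devices with AI features
peak_rps_per_user = 60            # the paper's assumed peak requests per second
per_minute = ai_capable * peak_rps_per_user * 60
print(f"{per_minute:.1e} queries per minute")  # ~5.4e+12, i.e. more than 5 trillion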
This level of demand puts pressure on two limited resources. The first is networking. Centralized inference endpoints create heavy incast traffic, higher latency, and congestion. Upgrading access and core networks to handle this load is expensive and slow.
The second constraint is energy. Estimates cited in the paper suggest that a single AI search query can consume up to 1,000 times more energy than a traditional search query. Inference latency targets measured in milliseconds per token push vendors toward centralized GPU clusters, which concentrate power consumption in a small number of locations.
The authors argue that these pressures make gatekeeping inevitable. Pricing tiers, usage caps, geographic restrictions, and institutional prioritization already exist in limited forms. Over time, these mechanisms become tools for managing scarcity.
Access limits create security and equity risks
The study links access restrictions to security and governance issues. When AI systems shape education, recruiting, healthcare, and research, unequal access translates directly into unequal capabilities.
The authors warn that selective access undermines merit-based systems. Users with paid or privileged access gain analytical and creative benefits that are not tied to skills or expertise. This shifts power to those who can afford faster and deeper AI assistance.
Language coverage and infrastructure gaps amplify the problem. AI systems still favor high-resource languages, particularly English. Regions with weaker connectivity or energy supply already receive slower or degraded service even before formal limits appear.
From a regulatory perspective, the paper notes that existing frameworks do not address this coupling between access and resources. The EU AI Act focuses on risk categories and provider obligations, but it does not establish any right of access to AI. U.S. governance relies on voluntary standards and sectoral rules. UNESCO promotes equitable access, but without binding enforcement mechanisms.
Reframing AI as shared infrastructure
The main proposal is to recognize access to AI as an intergenerational civil right. The intent is to protect current access while preventing resource use that harms future generations.
This framework treats AI as a shared social infrastructure, comparable to public libraries or communications networks. AI systems are trained on publicly produced knowledge, including research, texts, and cultural materials. The authors argue that restricting access to results privatizes the benefits of this shared input.
A decentralized AI distribution network
To operationalize this idea, the authors propose an AI Delivery Network, or AIDN. The model borrows from content delivery networks but adapts the approach for inference rather than static content.
The basic unit is a fragment of knowledge, represented as key-value (KV) cache entries from transformer inference. These fragments can be cached, combined, and reused across the network.
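The paper does not include reference code, but the mechanics are easy to sketch. Assuming fragments are identified by the token prefix whose attention state they hold (an assumption for illustration, not a detail from the paper), a fragment cache might look like this:

import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class KnowledgeFragment:
    # Hypothetical layout: the token prefix this fragment encodes, plus its
    # serialized key/value tensors (bytes as a stand-in for real tensor data).
    prefix_tokens: tuple
    kv_entries: bytes

class FragmentCache:
    # Fragments are keyed by a hash of their token prefix, so any node holding
    # a matching fragment can skip recomputing attention state for that prefix.
    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(prefix_tokens):
        return hashlib.sha256(repr(prefix_tokens).encode()).hexdigest()

    def put(self, fragment):
        self._store[self._key(fragment.prefix_tokens)] = fragment

    def get(self, prefix_tokens):
        return self._store.get(self._key(prefix_tokens))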
Each AIDN node includes three functions. A storage manager holds local knowledge and metadata. A delivery manager pushes and pulls knowledge fragments based on forecasted demand and local context. A generic inference endpoint handles user requests, performing inference locally where possible and forwarding requests upstream only when necessary.
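A minimal sketch of how such a node might route requests, with the names and the dictionary-based store being illustrative rather than drawn from the paper:

class AIDNNode:
    def __init__(self, upstream=None):
        self.store = {}           # storage manager: local fragments and metadata
        self.upstream = upstream  # next node toward the regional or cloud tier

    def prefetch(self, key, fragment):
        # Delivery manager: push or pull fragments based on forecasted demand.
        self.store[key] = fragment

    def infer(self, key, request):
        # Inference endpoint: answer locally when a fragment is cached,
        # forward upstream only when necessary.
        if key in self.store:
            return f"edge answer for {request!r} using {self.store[key]}"
        if self.upstream is not None:
            return self.upstream.infer(key, request)
        return f"full cloud inference for {request!r}"

cloud = AIDNNode()
edge = AIDNNode(upstream=cloud)
edge.prefetch("weather:paris", "kv-fragment")
print(edge.infer("weather:paris", "weather in Paris?"))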
Inference is distributed across three tiers: the edge, regional micro-data centers, and cloud infrastructure. Lightweight tasks run close to users, while more intensive reasoning is invoked selectively. Energy availability, latency, and congestion guide placement decisions.
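The paper's placement policy is not spelled out in this summary; one plausible reading, scoring each tier on energy headroom, latency, and congestion with purely illustrative weights, might be:

def placement_score(tier, w_energy=0.4, w_latency=0.4, w_congestion=0.2):
    # Higher is better: plentiful energy, low latency, low congestion.
    return (w_energy * tier["energy_headroom"]
            - w_latency * tier["latency_ms"] / 100
            - w_congestion * tier["congestion"])

tiers = [
    {"name": "edge",   "energy_headroom": 0.3, "latency_ms": 5,   "congestion": 0.2},
    {"name": "region", "energy_headroom": 0.6, "latency_ms": 25,  "congestion": 0.4},
    {"name": "cloud",  "energy_headroom": 0.9, "latency_ms": 120, "congestion": 0.7},
]
print(max(tiers, key=placement_score)["name"])  # prefers the edge for this workload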
The authors estimate that caching and reusing inferences could reduce computational demand by an order of magnitude for common tasks. This also limits the movement of data over long distances, which the paper identifies as a major driver of energy consumption.
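To put the claim in concrete terms: if a fraction h of requests is served from cached fragments, compute scales with the miss rate 1 − h, so a tenfold reduction corresponds to 1 / (1 − h) = 10, i.e. h = 0.9. Roughly nine in ten common requests would have to be answerable from cache (our arithmetic, not a figure from the paper).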
Security trade-offs of decentralization
Centralized AI services create single points of congestion and policy control. Decentralized inference shifts some risk outward but improves fault tolerance and local control.
The authors position AIDNs as a means to align equity, sustainability, and operational stability. Without such changes, they argue, access to AI will continue to be restricted due to economic and technical pressures rather than explicit policy choices.
Architectural decisions made today will determine whether AI becomes an enduring public capability or a limited resource available to a shrinking group.