By 2050, two out of every three people are expected to live in urban or metropolitan areas, a shift that could add as many as 2.5 billion people to cities worldwide. Growth on this scale puts significant strain on infrastructure, transportation, and public services. Rising population density, climate risks, aging infrastructure, and worsening traffic and environmental problems have created an urgent need for new ways to manage city systems.
In response, physical AI, which combines artificial intelligence with sensors, video analytics, digital twins, and edge computing, is becoming a key part of the vision for future smart cities.

At the heart of NVIDIA’s smart-city vision is the “Blueprint for Smart City AI,” a unified framework that brings together digital twins, synthetic data generation, AI agents for video analytics, and vision-language models (VLMs). This blueprint enables cities to simulate real-world conditions, gather and interpret vast streams of sensor data, and deploy large-scale real-time vision AI systems.
Key updates include foundation models from NVIDIA Cosmos, VLMs designed for photorealistic synthetic data, and upgrades to the VSS (Video Search and Summarization) blueprint within the Metropolis platform. By providing complete workflows and “cookbooks” (step-by-step guides) for traffic systems, sensor integration, and digital twin operations, NVIDIA aims to speed up AI deployment in cities by lowering the technical barriers to adoption.
At the Smart City Expo World Congress (SCEWC) in Barcelona, NVIDIA showcases five partner companies demonstrating how physical AI is being used in real-world urban settings.
Esri collaborates with NVIDIA to create AI agents that process and analyze large amounts of camera sensor data and visualize the results on interactive geospatial maps. In Raleigh, North Carolina, this project allows city operators to respond quickly to incidents, notify relevant departments, optimize traffic flows, and enhance infrastructure design.
Milestone Systems is adding generative AI capabilities to its XProtect video-management platform. It uses VLMs trained on 75,000 hours of traffic video to provide alerts, automate report creation, and summarize video context. These improvements could cut operator alarm fatigue by up to 30% by filtering out false alarms.

Linker Vision, a leader in applying the NVIDIA blueprint, is rolling out physical AI in cities such as Ho Chi Minh City and Da Nang. Building on its success in Kaohsiung, Taiwan, where incident response times decreased by up to 80%, the company uses 3D digital twins built with Omniverse to simulate and monitor traffic and construction activities.
In Dublin, Ireland, the Smart Dublin initiative collaborates with Bentley Systems, which contributes its 3D geospatial platform, Cesium, and with VivaCity, which provides AI-powered sensors. Together they monitor walking, cycling, and scooter use alongside motor vehicle traffic. By visualizing this data in real time through a digital twin, the city can identify dangerous areas and improve traffic flow.
Deloitte is using AI to automate street inspections across thousands of crosswalks and intersections. The goal is to protect vulnerable road users, such as pedestrians and cyclists. By using Cosmos Predict to transform static images into realistic videos and Cosmos Transfer to create varied scenarios such as fog, rain, and low light, the system helps assess intersections under different conditions.
These projects highlight three key points. First, combining sensor networks with vision AI enables near-real-time responses on a citywide scale. Second, digital twins and synthetic data workflows reduce reliance on real-world data alone and make it possible to simulate rare or dangerous situations. Third, the ecosystem approach, which brings together hardware, software, cloud, and edge partners along with system integrators, is crucial for handling the size and complexity of smart-city initiatives.

While the potential of physical AI in cities is exciting, scaling up presents significant challenges. These include integrating legacy infrastructure, managing large volumes of diverse data, ensuring privacy and compliance, and achieving real-time performance at the edge. The blueprint aims to tackle these problems by providing frameworks, models, and workflows. As more cities such as Dublin, Ho Chi Minh City, and Raleigh start pilot projects, their outcomes will show how quickly physical AI can move from pilots to standard practice.
In conclusion, NVIDIA’s approach for smart cities shifts the focus from minor improvements to a larger vision of physical AI on an urban scale. By combining simulations, real-world sensing, digital twins, and advanced AI models, the company and its partners are preparing cities to manage mobility, safety, climate resilience, and infrastructure better in the future. As urbanization accelerates, these innovations are set to play an essential role in how cities evolve and function.