COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
ECCV 2024



Context-Sensitive City Generation Given any Priors


Our method (green) harmonizes context among neighboring city blocks realistically, as in real data (bottom-middle). Other methods (e.g., LayoutDM, GlobalMapper) produce layouts that are either overly diverse or overly similar at city scale, as evaluated by the Context Score (CTS). Fully random and fully identical layouts are generated synthetically to illustrate the two extreme cases.


Abstract

The generation of large-scale urban layouts has garnered substantial interest across various disciplines. Prior methods have relied either on procedural generation, which requires manual rule coding, or on deep learning, which needs abundant data. However, these approaches have not considered the context-sensitive nature of urban layout generation. Our approach addresses this gap by leveraging a canonical graph representation for the entire city, which facilitates scalability and captures the multi-layer semantics inherent in urban layouts. We introduce a novel graph-based masked autoencoder (GMAE) for city-scale urban layout generation. The method encodes attributed buildings, city blocks, communities, and cities into a unified graph structure, enabling self-supervised masked training of the graph autoencoder. Additionally, we employ scheduled iterative sampling for 2.5D layout generation, prioritizing the generation of important city blocks and buildings. Our approach achieves good realism, semantic consistency, and correctness across the heterogeneous urban styles in 330 US cities. Codes and datasets are released at



Method


1. Canonical Graph Representation via Vector Quantization


Our method represents a city as a canonical graph G. Each node b_i represents a single city block, and each edge e_ij connects two spatially adjacent blocks. Each node carries a set of block features s_i and a quantized vector q_i that hierarchically captures the building layout enclosed by the block. The edge feature d_ij encodes the distance between the centroids of blocks i and j. The graph G is then used for GMAE training.
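The graph described above can be sketched in a few lines of plain Python. This is only an illustrative data structure under stated assumptions, not the released implementation: the names `CityGraph`, `connect`, `neighbors`, and `quantize`, the two-entry codebook, and the 2-D layout embeddings are all hypothetical, and a single nearest-codebook index stands in for the hierarchical quantized vector q_i.

```python
import math
from dataclasses import dataclass, field


@dataclass
class CityGraph:
    """Toy canonical graph: nodes are city blocks, edges join adjacent blocks."""
    centroids: list                             # per-block (x, y) centroid, part of s_i
    codes: list                                 # per-block code index, stand-in for q_i
    edges: dict = field(default_factory=dict)   # (i, j) -> centroid distance d_ij

    def connect(self, i, j):
        """Add an undirected edge e_ij whose feature is the centroid distance."""
        d = math.dist(self.centroids[i], self.centroids[j])
        self.edges[(min(i, j), max(i, j))] = d

    def neighbors(self, i):
        """Indices of the blocks adjacent to block i."""
        return sorted(b if a == i else a for (a, b) in self.edges if i in (a, b))


def quantize(vec, codebook):
    """Vector quantization: index of the nearest codebook entry."""
    return min(range(len(codebook)), key=lambda k: math.dist(vec, codebook[k]))


# Hypothetical inputs: a 2-entry codebook and one layout embedding per block.
codebook = [(0.0, 0.0), (1.0, 1.0)]
layout_embeddings = [(0.1, 0.2), (0.9, 0.8), (0.4, 0.1)]

g = CityGraph(
    centroids=[(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)],
    codes=[quantize(v, codebook) for v in layout_embeddings],
)
g.connect(0, 1)   # blocks 0 and 1 are adjacent
g.connect(1, 2)   # blocks 1 and 2 are adjacent
```

Storing one code index per node keeps the per-block payload fixed in size regardless of how many buildings the block contains, which is what makes a city-wide graph tractable in this sketch.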

2. Graph-based Masked Autoencoder


We propose a novel Graph-based Masked AutoEncoder (GMAE) for large-scale multi-layer 2.5D urban layout generation. Given a canonical city graph, the quantized building layout features Q are masked with a dynamic masking ratio in the range [0.5, 1.0], while the block shape and location features S are kept. The GNN encoder uses message passing between neighboring nodes to obtain context-aware node features F. The decoder uses F to reconstruct the masked features as Q'. The predicted Q' is then decoded into 2.5D urban layouts.
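The two mechanisms in this paragraph, dynamic-ratio masking and neighbor message passing, can be illustrated with a minimal stdlib-only sketch. It rests on several assumptions: the masking ratio is drawn uniformly from [0.5, 1.0], a sentinel value of -1 marks a masked code, and a single mean-aggregation step stands in for the learned GNN encoder. The function names `mask_codes` and `message_pass` are hypothetical, and nothing here reproduces the actual network.

```python
import random


def mask_codes(codes, rng, lo=0.5, hi=1.0, mask_token=-1):
    """Mask a random subset of per-block codes; the ratio is drawn from [lo, hi]."""
    ratio = rng.uniform(lo, hi)
    n_mask = max(1, round(ratio * len(codes)))
    masked_idx = set(rng.sample(range(len(codes)), n_mask))
    masked = [mask_token if i in masked_idx else c for i, c in enumerate(codes)]
    return masked, masked_idx


def message_pass(features, adjacency):
    """One mean-aggregation step: each node averages itself with its neighbors."""
    out = []
    for i, f in enumerate(features):
        group = [f] + [features[j] for j in adjacency[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


rng = random.Random(0)
codes = [3, 7, 1, 4, 2, 9]                 # toy stand-in for Q
masked, masked_idx = mask_codes(codes, rng)

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy per-node inputs
adjacency = {0: [1], 1: [0, 2], 2: [1]}           # path graph 0 - 1 - 2
context = message_pass(features, adjacency)       # toy stand-in for F
```

Because the ratio never drops below 0.5, at least half of the blocks are hidden in every training step, so the model in this setup must rely on the retained features S and on its neighbors rather than on the block's own layout code.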



3. Priority-based Scheduled Generation



We iteratively apply the pretrained GMAE to reconstruct masked node features. In each iteration, we accept a ratio of the predicted nodes determined by the scheduling function f(t) = 1 - cos(t/T), where t is the current iteration. After T iterations we obtain a full graph.
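The iterative procedure can be sketched as a simple loop. Several parts of this sketch are assumptions rather than details taken from the text: f(t) is read as the cumulative fraction of accepted nodes, the cosine argument is scaled by π/2 so that f(T) = 1 and the graph is complete after T iterations (pass `scale=1.0` to evaluate the formula literally as written), nodes are ranked by a hypothetical per-node priority score, and `toy_predict` is a stand-in for the pretrained GMAE.

```python
import math


def accept_ratio(t, T, scale=math.pi / 2):
    """Cumulative fraction of accepted nodes after iteration t (1-based).

    The pi/2 scaling is an assumption made so that accept_ratio(T, T) == 1.
    """
    return 1.0 - math.cos(scale * t / T)


def scheduled_generation(n_nodes, T, predict):
    """Fill all nodes within T rounds, accepting the highest-priority ones first."""
    values = [None] * n_nodes
    for t in range(1, T + 1):
        pending = [i for i, v in enumerate(values) if v is None]
        if not pending:
            break
        # Cumulative target for this round; the final round accepts everything.
        target = n_nodes if t == T else math.ceil(accept_ratio(t, T) * n_nodes)
        n_accept = max(1, target - (n_nodes - len(pending)))
        preds = predict(values, pending)   # {node: (value, priority)}
        ranked = sorted(pending, key=lambda i: preds[i][1], reverse=True)
        for i in ranked[:n_accept]:
            values[i] = preds[i][0]
    return values


def toy_predict(values, pending):
    """Hypothetical model: predicts 10 * index, with priority falling by index."""
    return {i: (i * 10, 1.0 / (1 + i)) for i in pending}


result = scheduled_generation(8, T=4, predict=toy_predict)
```

With this schedule only a few nodes are fixed in the early rounds and many more in the later ones, so later predictions are conditioned on a growing set of already accepted neighbors.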



Results: City-Scale Layout Generation


Given the same road network (except for SDXL), all of the above methods generate urban layouts in a single pass, without post-processing or human-in-the-loop refinement. The even rows show zoomed-in views of the areas highlighted in the odd rows. Our method generates a realistic distribution of urban layouts with plausible context-dependent behavior, as indicated by the Context Score (CTS).



BibTeX


@article{he2024coho,
  title={COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation},
  author={He, Liu and Aliaga, Daniel},
  journal={arXiv preprint arXiv:2407.11294},
  year={2024}
}
                  
