HouseDiffusion.pdf - 墨天轮文档

HouseDiffusion.pdf

落枫0907

164

10页

0次

2023-09-24

100墨值下载

HouseDiffusion: Vector Floorplan Generation via a Diffusion Model

with Discrete and Continuous Denoising

Mohammad Amin Shabani, Sepidehsadat Hosseini, Yasutaka Furukawa

Simon Fraser University

{mshabani, sepidh, furukawa}@sfu.ca

Time Input Graph

0 1020302501000

Figure 1. Given a bubble diagram as the input constraint, HouseDiffusion directly generates a vector ﬂoorplan by initializing the room/door

coordinates with Gaussian noise and iteratively denoising them. Qualitative and quantitative evaluations demonstrate that HouseDiffusion

signiﬁcantly outperforms the current state-of-the-art with large margins.

Abstract

The paper presents a novel approach for vector-

ﬂoorplan generation via a diffusion model, which denoises

2D coordinates of room/door corners with two inference ob-

jectives: 1) a single-step noise as the continuous quantity to

precisely invert the continuous forward process; and 2) the

ﬁnal 2D coordinate as the discrete quantity to establish ge-

ometric incident relationships such as parallelism, orthog-

onality, and corner-sharing. Our task is graph-conditioned

ﬂoorplan generation, a common workﬂow in ﬂoorplan de-

sign. We represent a ﬂoorplan as 1D polygonal loops,

each of which corresponds to a room or a door. Our dif-

fusion model employs a Transformer architecture at the

core, which controls the attention masks based on the in-

put graph-constraint and directly generates vector-graphics

ﬂoorplans via a discrete and continuous denoising pro-

cess. We have evaluated our approach on RPLAN dataset.

The proposed approach makes signiﬁcant improvements in

all the metrics against the state-of-the-art with signiﬁcant

margins, while being capable of generating non-Manhattan

structures and controlling the exact number of corners per

room. A project website with supplementary video and doc-

ument is here https://aminshabani.github.io/housediffusion.

1. Introduction

Automated ﬂoorplan generation made tremendous

progress in the last few years. While not being fully au-

tonomous yet, the state-of-the-art techniques help architects

to explore the space of possible designs quickly [33, 34].

90% of buildings do not have dedicated architects for ﬂoor-

plan design due to their cost in North America. This tech-

nology will make the work of professional architects afford-

able to more house buyers.

Despite recent progress, state-of-the-art ﬂoorplan gener-

ative models produce samples that are incompatible with

the input constraint, lack in variations, or do not look like

ﬂoorplans [34]. The issue is the raster geometry analysis

via convolutions, where a room is represented as a binary

image. The raster analysis is good at local shape reﬁnement

but lacks in global reasoning, and requires non-trivial post-

processing for vectorization [8, 34]. On the other hand, di-

rect generation of vector ﬂoorplans is not trivial either. Dif-

ferent from the generation of images or natural languages,

structured geometry exhibit precise incident relationships

among architectural components (e.g., doors and rooms).

arXiv:2211.13287v1 [cs.CV] 23 Nov 2022

For example, a wall is usually axis-aligned, where the coor-

dinate values of adjacent corners are exactly equal. A wall

might be shared with adjacent rooms further. Direct regres-

sion of 2D coordinates would never achieve these relation-

ships. One could use a discrete representation such as one

hot encoding over possible coordinate values with classiﬁ-

cation, but this causes a label imbalance (i.e., most values

are 0 in the encoding) and fails the network training.

This paper presents a novel approach for graph-

constrained ﬂoorplan generation that directly gener-

ates a vector-graphics ﬂoorplan (i.e., without any post-

processing), handles non-Manhattan architectures, and

makes signiﬁcant improvements on all the metrics. Con-

cretely, a bubble-diagram is given as a graph, whose nodes

are rooms and edges are the door-connections. We represent

a ﬂoorplan as a set of 1D polygonal loops, each of which

corresponds to a room or a door, then generate 2D coordi-

nates of room/door corners (See Fig. 1). The key idea is the

use of a Diffusion Model (DM) with a careful design in the

denoising targets. Our approach infers 1) a single-step noise

amount as a continuous quantity to precisely invert the con-

tinuous forward process; and 2) the ﬁnal 2D coordinate as

the discrete quantity to establish incident relationships. The

discrete representation after the denoising iterations is the

ﬁnal ﬂoorplan model.

Qualitative and quantitative evaluations show that the

proposed system outperforms the existing state-of-the-art,

House-GAN++ [34], with signiﬁcant margins, while being

end-to-end and capable of generating non-Manhattan ﬂoor-

plans with exact control on the number of corners per room.

We will share all our code and models.

2. Related Work

Floorplan generation: Generation of 3D buildings and

ﬂoorplans has been an active area of research from a pre-

deep learning era [4, 12, 31, 32, 37]. The research area

has further ﬂourished with the emergence of deep learning.

Nauata et al. [33] proposed House-GAN as a graph con-

strained ﬂoorplan generative model via Generative Adver-

sarial Network [11]. House-GAN generates segmentation

masks of different rooms and combines them to a single

ﬂoorplan. The authors further improved the quality of the

generation by House-GAN++ [34], which iteratively reﬁnes

a layout. Given the boundary of a ﬂoorplan, Upadhyay et

al. [44] used the embedded input boundary as an additional

input feature to predict a ﬂoorplan. Hu et al. [17] proposed

Graph2Plan that retrieves a graph layout from a dataset and

generates room bounding boxes as well as a ﬂoorplan in an

ad-hoc way. Sun et al. [42] proposed to iteratively gener-

ate connectivity graphs of rooms and a ﬂoorplan semantic

segmentation mask. Given a set of room types and their

area sizes as the constraint, Luo and Huang [29] proposed a

vector generator and a raster discriminator to train a GAN

model using differential rendering. Although their method

generates vector ﬂoorplans directly, it is limited to rectan-

gular shapes. Along with the adjacency graph as the in-

put, Yin et al. [5] use graph-theoretic and linear optimiza-

tion techniques to generate ﬂoorplans. Our paper also tack-

les a graph-constrained ﬂoorplan generation with a bubble

diagram as the constraint [34]. The key difference is that

HouseDiffusion processes a vector geometry representation

from start to ﬁnish, and hence, directly generating vector

ﬂoorplan samples.

Diffusion models: Deep generative models have seen great

success in broader domains [11,24,36,38,46], where a Dif-

fusion Model (DM) [7, 41, 49] is an emerging technique.

Ho et al. [14] used a DM to boost image generation quality.

Dhariwal and Nichol [35] made improvements by propos-

ing a new noise schedule and learning the variances of the

reverse process. The same authors made further improve-

ments by novel architecture and classiﬁer guidance [10].

DMs have been adapted to many other tasks such as Natu-

ral Language Processing [23], Image Captioning [9], Time-

Series Forecasting [43], Text-to-Speech [20, 22], and ﬁ-

nally Text-to-Image as seen in the great success of DALL-E

2 [39] and Imagen [40].

Molecular Conformation Generation [16, 18, 28, 48] and

3D shape generation [26, 27, 30, 50] are probably the clos-

est to our task. What makes our task unique and challeng-

ing is the precise geometric incident relationships, such as

parallelism, orthogonality, and corner-sharing among dif-

ferent components, which continuous coordinate regression

would never achieve. In this regard, several works use dis-

crete state space [3, 6, 15] or learn an embedding of discrete

data [9,23] in the DM formulation. However, we found that

these pure discrete representations do not train well, prob-

ably because the diffusion process is continuous in nature.

In contrast, our formulation simultaneously infers a single-

step noise as the continuous quantity and the ﬁnal 2D co-

ordinate as the discrete quantity, achieving superior gener-

ation capabilities (See Sect. 5.3 for more analysis). To our

knowledge, our work is the ﬁrst in using DMs to generate

structured geometry.

3. Preliminary

Diffusion models (DMs) denoise a Gaussian noise x

towards a data sample x

in T steps, whose training con-

sists of the forward and the reverse processes. The forward

process takes a data sample x

and generates a noisy sample

at time step t by sampling a Gaussian noise ϵ ∼ N(0, I):

√

(1 − γ

)ϵ. (1)

is a noise schedule that gradually changes from 1 to

0. The reverse process starts from a pure Gaussian noise

of 10

100墨值下载

关注

评论