A modern construction site is a busy and complicated place. A newcomer trying to understand everything that is going on can easily fall prey to sensory overload. Even industry veterans struggle to track progress in detail or confidently predict cost and schedule overruns. Doxel helps our customers make sense of their sites by applying techniques from computer vision, machine learning, and artificial intelligence to combine data from many sources into a unified picture of reality. LIDAR point clouds, 360 images, 3D meshes, work schedules, industry know-how — all these modalities of data are fused together in our data pipeline to create actionable insights into the construction process.
This post provides a high-level overview of our system to set the stage for deeper dives into our product and technology over the next few months.
Every site is unique in some way, but at Doxel we are able to draw on patterns we have observed across all of our customers, as well as the cumulative decades of construction expertise of our staff. We have compiled this knowledge into our Construction Encyclopedia, a detailed hierarchical ontology that categorizes and relates every object we scan. The Encyclopedia goes beyond a mere catalog of parts — it encodes the key distinctions that we need in order to predict schedule and cost, as well as the metadata that powers our AI solutions. The architectural systems our customers are building are complex, multi-stage projects that cannot be understood correctly without context and experience. The Construction Encyclopedia gives Doxel the ability to automatically map these systems onto the correct solutions efficiently at scale.
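To make the idea of a hierarchical ontology with attached metadata concrete, here is a minimal sketch in Python. It is not the Construction Encyclopedia itself: the `Category` class, the category names, and the metadata keys (`trade`, `unit`) are all hypothetical and chosen only for illustration. The sketch shows one plausible way a child category could inherit metadata from its ancestors.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Category:
    """One node in a hypothetical construction ontology."""
    name: str
    # Exclude the parent link from repr/eq to avoid walking the tree in a cycle.
    parent: Optional["Category"] = field(default=None, repr=False, compare=False)
    metadata: Dict[str, object] = field(default_factory=dict)
    children: List["Category"] = field(default_factory=list)

    def add_child(self, name: str, **metadata: object) -> "Category":
        """Create a sub-category under this node and return it."""
        child = Category(name=name, parent=self, metadata=dict(metadata))
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        """Names from the root down to this node."""
        names: List[str] = []
        node: Optional[Category] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]

    def lookup(self, key: str) -> object:
        """Return the nearest value for `key`, walking up toward the root."""
        node: Optional[Category] = self
        while node is not None:
            if key in node.metadata:
                return node.metadata[key]
            node = node.parent
        raise KeyError(key)


# Hypothetical categories and metadata, for illustration only.
root = Category("site")
mechanical = root.add_child("mechanical", trade="MEP")
ductwork = mechanical.add_child("ductwork", unit="linear_ft")

print(ductwork.path())           # ['site', 'mechanical', 'ductwork']
print(ductwork.lookup("trade"))  # MEP (inherited from the parent category)
```

Storing metadata on the most general category that it applies to, and resolving it by walking up the tree, keeps each entry small while still letting every scanned object answer questions about its trade or unit of measure.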
When Doxel onboards a new customer, our systems automatically categorize the common components, and then our team of human experts works to finalize a complete digital twin of the site. This is a collaborative process in which our experienced BIM Engineers work with the customer to bring their model up to our high standards of accuracy and completeness. A great deal of contextual knowledge and construction industry experience is required to fully understand the relative importance of each element on the site and the implications of particular delays or mis-installations. By encoding this information upfront in our Construction Encyclopedia, our Model Onboarding team sets up the rest of our data pipeline for success.
Scan Registration & Alignment
Automation is crucial to our work, because our Data Capture Coordinators are bringing in enormous quantities of complex, multimodal data from each of our sites every week. We deploy state-of-the-art LIDAR, 360 camera, and drone-based site scanners to collect a full picture of the progress of our sites from every angle. Meanwhile, our customers are not standing still: their work schedules and architectural diagrams are adapting and evolving to reflect what is happening on the ground. Correlating all of this incoming information is a significant challenge, but our automated data pipeline is up to the task.
As scans and model updates are uploaded, our proprietary computer vision algorithms automatically register and align all of these data streams into a single fused representation of the ground truth. Through a rigorous, data-driven process of characterization and comparison, our Data Capture engineers have selected the best scanning devices for each phase of construction and optimized the scanning parameters for maximum accuracy. The result is a seamless flow of site scans that reliably achieves an alignment accuracy within one inch over the entire site, week after week. This fast, accurate, and automatic preprocessing is indispensable for providing reliable inputs to all of our site analytics.
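Our registration algorithms are proprietary and are not reproduced here. As a rough illustration of the core step in aligning two 3D data sets, the sketch below uses the textbook Kabsch method: given pairs of points that are already matched, it finds the rotation and translation that minimize the squared distance between them. The function name `rigid_align`, the synthetic data, and the assumption that correspondences are known in advance are all ours; practical pipelines also have to estimate those correspondences, for example with an iterative closest point scheme.

```python
import numpy as np


def rigid_align(source: np.ndarray, target: np.ndarray):
    """Least-squares rotation R and translation t mapping source onto target.

    Both inputs are (N, 3) arrays where row i of `source` corresponds to
    row i of `target`. Returns (R, t) such that R @ source[i] + t is as
    close as possible to target[i] in the least-squares sense.
    """
    src_center = source.mean(axis=0)
    tgt_center = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (source - src_center).T @ (target - tgt_center)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_center - R @ src_center
    return R, t


# Synthetic check: rotate and shift a random cloud, then recover the motion.
rng = np.random.default_rng(0)
scan = rng.uniform(-10.0, 10.0, size=(50, 3))
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([2.0, -1.0, 0.5])
model = scan @ R_true.T + t_true

R, t = rigid_align(scan, model)
# Worst-case distance between an aligned scan point and its model point.
max_residual = np.linalg.norm(scan @ R.T + t - model, axis=1).max()
```

A worst-case residual like `max_residual` is the kind of quantity one would track against an alignment tolerance; on this noise-free synthetic data it is at the level of floating-point error.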
Machine Learning Annotation
Given this clean and contextualized data as input, the ML team at Doxel is able to provide automated site understanding with state-of-the-art models that compare the expected and observed state of objects on the site. In the ideal case, comparing a mesh to a point cloud would be a fairly straightforward problem of geometry. In the real world, we need to deal with a wide variety of complications, such as occlusion, mis-installation, or inaccurate mesh geometries. We encounter a broad and ever-growing range of objects and scenarios across the diverse sites that we service, so we strive to develop models with features that encode a high-level geometric understanding that generalizes across our problem domain, and systems that are flexible enough to cover the edge cases.
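The "ideal case" geometric comparison can be sketched in a few lines. The example below is a simple illustrative heuristic, not one of our production models: it takes points sampled from an element's expected surface and reports the fraction that have a scanned point nearby. The function name `observed_fraction`, the tolerance value, and the toy coordinates are assumptions made for the example.

```python
import numpy as np


def observed_fraction(expected_pts: np.ndarray,
                      scan_pts: np.ndarray,
                      tol: float) -> float:
    """Fraction of expected surface samples with a scan point within `tol`.

    `expected_pts` is (M, 3) and `scan_pts` is (N, 3), in the same units as
    `tol`. Uses brute-force pairwise distances, which is fine for small inputs.
    """
    diffs = expected_pts[:, None, :] - scan_pts[None, :, :]  # (M, N, 3)
    nearest = np.linalg.norm(diffs, axis=2).min(axis=1)      # (M,)
    return float((nearest <= tol).mean())


# Four samples along an expected element; the scan only covers the first two.
expected = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [2.0, 0.0, 0.0],
                     [3.0, 0.0, 0.0]])
scan = np.array([[0.0, 0.0, 0.01],
                 [1.0, 0.0, -0.02]])

print(observed_fraction(expected, scan, tol=0.05))  # 0.5
```

A score of 0.5 is exactly where this heuristic runs out of road: it cannot tell a half-installed element from a fully installed one that is half hidden behind scaffolding, which is why the complications listed above call for richer models.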
We employ a wide range of models of varying complexity, from simple heuristics to Graph Neural Networks (GNNs) specially architected for comparing mesh and point cloud data. GNNs are particularly well suited to handling the sparse and irregular nature of point cloud data. Our GNN learns local features that relate points to their neighbors in the scanned point cloud, and global features describing the broader shape of the cloud. These features are represented by layers that act on the graph of nearby points in a manner analogous to convolutional operators acting on pixels in images. Other models project the cloud and mesh into 2D along various axes, allowing us to leverage the many methods of deep learning on images to better model our 3D data.
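To give a feel for what a layer acting "on the graph of nearby points" looks like, here is a small NumPy sketch of a generic edge-based message-passing layer over a k-nearest-neighbor graph, in the style of published EdgeConv-like layers. It is not our architecture: the function names, the choice of k, the random weights standing in for learned parameters, and the max-pooling aggregation are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(points: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of each point, excluding itself."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    return np.argsort(dists, axis=1)[:, :k]              # (N, k)


def edge_layer(points: np.ndarray, features: np.ndarray,
               weights: np.ndarray, k: int) -> np.ndarray:
    """One message-passing layer over the k-NN graph of `points`.

    For point i and neighbor j the message is ReLU([f_i, f_j - f_i] @ weights);
    messages are max-pooled over the k neighbors of each point.
    """
    nbrs = knn_indices(points, k)                         # (N, k)
    center = np.repeat(features[:, None, :], k, axis=1)   # (N, k, F)
    offset = features[nbrs] - center                      # (N, k, F)
    edges = np.concatenate([center, offset], axis=2)      # (N, k, 2F)
    messages = np.maximum(edges @ weights, 0.0)           # (N, k, F_out)
    return messages.max(axis=1)                           # (N, F_out)


rng = np.random.default_rng(0)
cloud = rng.normal(size=(32, 3))        # toy point cloud
W = rng.normal(size=(6, 8))             # random stand-in for learned weights

local_features = edge_layer(cloud, cloud, W, k=4)   # one vector per point
global_feature = local_features.max(axis=0)         # one vector per cloud
```

Because neighbors are found by distance and aggregated with a max, the per-point outputs do not depend on the order in which points are stored, and pooling over all points yields a single descriptor of the overall shape. That order-independence is one reason graph-style layers are a natural fit for unordered point clouds.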
Expert Human Verification
No machine learning system can match the accuracy of domain experts carefully reviewing every decision, and the construction space is no exception. That is why Doxel routes its machine learning annotations to construction experts through advanced tooling built in-house. In the beginning, we mimicked our customers and used off-the-shelf tools to verify our ground truth. We found that determining site progress by simultaneously navigating BIM models, roughly localized panos, huge dense point clouds, and disjointed scheduling software was incredibly clumsy! That’s why we built a “digital surveyor command center” interface that integrates the machine learning proposals, BIM model, site schematics, 2D and 3D data captures, and the weekly pull plan schedule. Our integrated approach sped up verification by several orders of magnitude while simultaneously increasing accuracy. In the rare event that our ground truth is incorrect, our system feeds the expert’s annotations back into Doxel’s training pipeline to improve our future accuracy.
All of these stages feed the engine that delivers key insights to Owners and General Contractors on some of the world’s largest construction projects.