Data Collection and Generation for AI Models

As AI-based design workflows are progressively adopted by the industry, companies need to implement data management workflows, which are key to building accurate and well-defined AI models.

A perceived limitation in deploying AI methodologies in automotive companies is the amount of data available to train predictive models. While it is true that Machine Learning benefits greatly from large datasets, engineering teams sometimes see data collection as one of the main obstacles to deploying these predictive models at scale.

Indeed, classical AI-based models rely on a very specific parametrization to be trained. This means that these models cannot be used across topologies or parametrizations, which largely limits the diversity of data that can be collected to train a single model. More specifically, for each new program, a new predictive model must be built from scratch, and the dataset must be regenerated for each new design campaign. This standard, siloed approach largely restricts the range of suitable applications within the automotive industry.

However, this can be alleviated by a specific class of AI models: 3D Deep Learning-based predictive models. These models use raw 3D CAD and CAE data and are completely agnostic to any specific parametrization of designs. This means that the same model can be trained across many geometrical parametrizations and topologies, benefiting from a much bigger pool of data. Ultimately, any simulation result, coming from any CAE team, can be reused inside these models, making them much more flexible and thus drastically reducing the effort needed to collect the data.
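To make the contrast concrete, here is a minimal sketch of why geometry-based inputs can be pooled where parameter-based inputs cannot. It assumes, purely for illustration, that each design is available as a triangle mesh and is converted to a fixed-size point cloud by area-weighted surface sampling; this is one common geometric representation, not necessarily the one used by any particular 3D Deep Learning tool. The parameter names, the toy meshes, and the `sample_point_cloud` helper are all hypothetical.

```python
import math
import random


def triangle_area(tri):
    """Area of a 3D triangle given as three (x, y, z) vertices."""
    a, b, c = tri
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def sample_point_cloud(triangles, n_points, seed=0):
    """Sample n_points on the mesh surface, picking triangles by area."""
    rng = random.Random(seed)
    weights = [triangle_area(t) for t in triangles]
    chosen = rng.choices(triangles, weights=weights, k=n_points)
    cloud = []
    for a, b, c in chosen:
        # Uniform sampling inside a triangle via barycentric coordinates.
        s = math.sqrt(rng.random())
        r = rng.random()
        w_a, w_b, w_c = 1 - s, s * (1 - r), s * r
        cloud.append(tuple(w_a * a[i] + w_b * b[i] + w_c * c[i] for i in range(3)))
    return cloud


# Hypothetical parametrizations of two programs: different names and counts,
# so their parameter vectors cannot feed the same parameter-based model.
program_1_params = {"duct_width": 0.12, "duct_height": 0.08, "bend_radius": 0.05}
program_2_params = {"inlet_diameter": 0.10, "flap_angle": 30.0}
same_parametrization = program_1_params.keys() == program_2_params.keys()

# Toy meshes with different triangle counts (stand-ins for real CAD exports).
program_1_mesh = [((0, 0, 0), (1, 0, 0), (0, 1, 0)), ((1, 0, 0), (1, 1, 0), (0, 1, 0))]
program_2_mesh = [
    ((0, 0, 0), (2, 0, 0), (0, 0, 1)),
    ((2, 0, 0), (2, 0, 1), (0, 0, 1)),
    ((0, 0, 1), (2, 0, 1), (1, 1, 1)),
]

# Both designs end up with the same input shape: 256 points of 3 coordinates.
clouds = [sample_point_cloud(mesh, 256) for mesh in (program_1_mesh, program_2_mesh)]
```

Because every design is reduced to the same kind of object (a set of 3D points), designs from different programs can sit in one training set regardless of how their CAD models were parametrized.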

Traditional Approach vs. Transfer Approach

Let’s take a concrete example from HVAC development:

HVAC Program 1 - Simulation (left) and CAD Geometry (right)
HVAC Program 2 - Simulation (left) and CAD Geometry (right)

We assume that the Climate Control team has stored design iterations and simulation results for the past 5 HVAC programs. Within each of these programs, the team usually performs between 20 and 40 design variations to meet its internal KPIs. In total, the team has approximately 150 designs and corresponding CFD runs that could be used to train a Machine Learning model. However, these 5 HVAC programs have different CAD parametrizations, and their topologies differ widely, as they correspond to various vehicle types.

With standard, parameter-based Machine Learning approaches, a new surrogate model must be built specifically for each program, and the engineering team cannot reuse that AI model for the next program. Only ~20-40 CFD simulations are available per program, which would not be enough to train a surrogate model with sufficient accuracy.

Using 3D Deep Learning, a single model can now be trained across all programs, using the ~150 data points available, which supports a more accurate predictive model. For the next program, this same model can be reused directly, with zero or only a few additional simulations required.
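The dataset sizes in this example follow from simple arithmetic, sketched below. The numbers come from the scenario above (5 programs, 20 to 40 design variations each); the variable names are only illustrative.

```python
N_PROGRAMS = 5
DESIGNS_PER_PROGRAM = (20, 40)  # low and high estimates per program

per_program_low, per_program_high = DESIGNS_PER_PROGRAM

# Siloed approach: each surrogate model only sees its own program's runs.
siloed_training_set = DESIGNS_PER_PROGRAM

# Pooled approach: one model sees the runs of every program.
pooled_low = N_PROGRAMS * per_program_low
pooled_high = N_PROGRAMS * per_program_high
pooled_mid = (pooled_low + pooled_high) // 2

# How many times more data the pooled model gets than a single-program model.
gain_vs_largest_program = pooled_mid / per_program_high
gain_vs_smallest_program = pooled_mid / per_program_low

print(f"Per program: {per_program_low}-{per_program_high} simulations")
print(f"Pooled: {pooled_low}-{pooled_high} simulations (midpoint {pooled_mid})")
print(f"Gain: {gain_vs_largest_program:.2f}x to {gain_vs_smallest_program:.2f}x")
```

The midpoint of the pooled range matches the ~150 designs quoted in the example, several times what any single program provides on its own.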

If you would like more details or want to benefit from 3D Deep Learning in your engineering practices, feel free to get in touch with us.

About the author
Thomas von Tschammer
Director of Operation at Neural Concept, he joined the team in 2018 aiming to empower engineers with a next-generation tool dedicated to CAD and CAE.