Optimization is a fundamental process in many scientific and engineering applications. Optimizing a function means searching its domain for an input that yields the minimum or maximum value of the given objective. When we have access to an analytical expression of the objective function, the answer to the optimization problem might be straightforward. Even when it is infeasible to obtain the analytical solution directly, gradient-based methods combined with heuristic search give satisfactory results in most cases. But what if we don’t have direct access to the objective function or its gradients? In many engineering applications, the form of the objective function can be very complex and intractable to analyze. Evaluating it might require solving a complex partial differential equation (PDE), which makes it computationally very expensive.
In such cases, we usually refer to the objective function as a black-box function: when queried at some input locations, it answers with the true objective value we are interested in. But in engineering design problems, querying the function can be very expensive! Function evaluation time can be on the order of minutes, hours, or even days. One would therefore want to be highly selective in choosing the input locations to query. One widely used technique for this is Bayesian optimization. It works by constructing, or learning, a probabilistic model of the objective function, called the surrogate model, which is then searched efficiently with an acquisition function before candidate samples are chosen for evaluation on the real objective function.
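The loop described above can be sketched in a few lines. The following is a minimal illustration, not the implementation used in any particular software: the toy objective `expensive_objective`, the squared-exponential kernel, the lower-confidence-bound acquisition rule, and all parameter values (`length_scale`, `kappa`, the candidate grid) are assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    # The prior variance is 1 for this kernel; subtract what the data explains.
    var = 1.0 - np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def expensive_objective(x):
    """Hypothetical stand-in for a costly simulation (to be minimized)."""
    return np.sin(3.0 * x) + 0.5 * x

def bayesian_optimization(n_initial=3, n_iterations=10, kappa=2.0, seed=0):
    """Alternate between fitting the surrogate and querying the objective."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 2.0, 201)
    x_obs = rng.uniform(0.0, 2.0, size=n_initial)
    y_obs = expensive_objective(x_obs)
    for _ in range(n_iterations):
        mean, std = gp_posterior(x_obs, y_obs, candidates)
        # Lower confidence bound: favor low predictions and high uncertainty.
        acquisition = mean - kappa * std
        x_next = candidates[np.argmin(acquisition)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, expensive_objective(x_next))
    best = np.argmin(y_obs)
    return x_obs[best], y_obs[best], len(y_obs)
```

Calling `bayesian_optimization()` spends 13 evaluations of the toy objective in total (3 initial, 10 chosen by the acquisition function) and returns the best input found, its value, and the evaluation count.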
The video embedded in this article provides a glimpse of how the concepts covered here are put together and made easily accessible to the engineer through a simple user interface. Though it treats a rather synthetic example, the video gives a first impression of the online learning loop automation possible using the Neural Concept Shape (NCS) software.
Global optimization in a nutshell
In summary, given a black-box function, an efficient search strategy of its input space to find the optimum boils down to answering the two following questions:
- What surrogate model should I use to substitute the true objective function?
- Given a specific surrogate model, what is the exploration strategy I should use to explore the different regions of the input and select next points to query?
The surrogate model we are after is one that we can use to make reliable predictions about the latent function, but also one that maintains a measure of uncertainty over these predictions. That’s why Gaussian Processes (GPs) have been widely used to answer the modeling question. GP-based surrogate models provide a flexible mechanism for learning continuous functions by interpolating observations. The confidence intervals GPs provide can also be used to assess whether the predictions should be refined in some regions of interest. GPs are very general and enjoy a neat mathematical foundation. They have, however, the shortcoming of being limited to a rather small number of parameters describing the input space. In other words, GPs lose their efficiency in high-dimensional spaces, when the number of features exceeds a few dozen.
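For reference, the prediction and its uncertainty take a closed form in standard GP regression. The notation below is an assumption introduced for this sketch and does not appear elsewhere in the article: $X$ and $\mathbf{y}$ are the observed inputs and values, $k$ is the kernel, $K = k(X, X)$, $\mathbf{k}_* = k(X, x_*)$, and $\sigma_n^2$ is the observation noise variance.

```latex
\begin{aligned}
\mu(x_*) &= \mathbf{k}_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} \mathbf{y}, \\
\sigma^2(x_*) &= k(x_*, x_*) - \mathbf{k}_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} \mathbf{k}_* .
\end{aligned}
```

The second line is the source of the confidence intervals: the predictive variance shrinks near observed inputs and returns to the prior variance far from them.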
As for the exploration question, when the input space is relatively low dimensional, both grid and random search strategies can do the job. But for most realistic optimization problems, the dimensionality of the input space is so high that a grid search quickly becomes intractable, and random search may not be the best option either. A combination of evolutionary algorithms with a heuristic local search is often used to efficiently search the input space and acquire new samples to evaluate.
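As a rough illustration of the evolutionary part of such a strategy, here is a minimal sketch of a mutation-and-selection search over a box-bounded input space. It is one simple variant among many and assumes Gaussian mutation with elitist selection; the local search step is omitted, and the function name and parameters are hypothetical. In a surrogate-based loop, `objective` would typically be the cheap acquisition function rather than the expensive simulation.

```python
import random

def evolutionary_search(objective, bounds, population_size=20, n_generations=30,
                        mutation_scale=0.1, seed=0):
    """Minimize `objective` over a box with a simple elitist evolutionary scheme."""
    rng = random.Random(seed)

    def clip(point):
        # Keep every coordinate inside its [low, high] interval.
        return [min(max(v, low), high) for v, (low, high) in zip(point, bounds)]

    population = [[rng.uniform(low, high) for low, high in bounds]
                  for _ in range(population_size)]
    history = []
    for _ in range(n_generations):
        # Mutation: perturb every parent with Gaussian noise scaled to the box.
        children = [clip([v + rng.gauss(0.0, mutation_scale * (high - low))
                          for v, (low, high) in zip(parent, bounds)])
                    for parent in population]
        # Selection: parents and children compete and only the best survive.
        population = sorted(population + children, key=objective)[:population_size]
        history.append(objective(population[0]))
    return population[0], history
```

Because parents compete with their children, the best value recorded in `history` never gets worse from one generation to the next.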
The non-convexity of the objective is another challenge faced in optimization problems. When constructing the surrogate and searching it, one should be wary of premature convergence and the possibility of getting stuck in a local rather than the global optimum.
Most real-world optimization applications are formulated as multi-objective optimization problems, where we seek to optimize several criteria simultaneously while taking into account the constraints imposed. In that case, instead of finding a single ultimate optimum, the goal of the optimization is to recover the Pareto front of these different objectives or, in certain cases, to identify Pareto-optimal points only in a subset of the Pareto front.
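To make the notion of a Pareto front concrete, the sketch below filters a set of candidate designs down to the non-dominated ones. It assumes every objective is to be minimized; the sample values are made up for the example.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical designs scored on two objectives, e.g. (drag, weight).
designs = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
print(pareto_front(designs))  # [(1, 5), (2, 3), (4, 1)]
```

Here `(3, 4)` is dropped because `(2, 3)` beats it on both objectives, and `(5, 5)` is dropped because `(1, 5)` is at least as good everywhere and strictly better on the first objective. This brute-force filter compares every pair of points, which is fine for small candidate sets.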
Another equally important aspect to consider when tackling a global optimization problem is the possibility of exploiting multi-fidelity evaluations. When the number of high-fidelity evaluations at our disposal is limited, as is often the case for very complex functions, we can kick-start the surrogate model training with lower-fidelity evaluations to first explore which regions of the search space to query further, while reserving the higher-fidelity evaluations for refining the regions where high-accuracy predictions matter most. Constructing the surrogate model using multi-fidelity evaluations provides a nice trade-off between prediction accuracy and computational efficiency. Furthermore, by blending evaluations coming from different sources, it contributes to a more interoperable approach to the optimization process. Learning the surrogate model from multi-fidelity evaluations goes hand in hand with uncertainty quantification. The lower the fidelity of an evaluation, the wider the confidence interval around the prediction, and vice versa. In the context of active learning, these confidence intervals, or variance maps, are an essential part of planning further sample acquisition.
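The link between fidelity and interval width can be illustrated with the simplest possible case. The sketch below assumes, purely for illustration, that fidelity is modeled as Gaussian observation noise on a GP with a given prior variance, and computes the posterior standard deviation at a point that has been evaluated once. The noise levels are made-up values, and real multi-fidelity models are usually richer than this.

```python
import math

def posterior_std_at_observation(noise_variance, prior_variance=1.0):
    """Posterior standard deviation of a GP at a point observed once with Gaussian noise."""
    # Conditioning a Gaussian prior on one noisy observation gives
    # variance = prior * noise / (prior + noise).
    posterior_variance = prior_variance * noise_variance / (prior_variance + noise_variance)
    return math.sqrt(posterior_variance)

high_fidelity = posterior_std_at_observation(noise_variance=0.01)
low_fidelity = posterior_std_at_observation(noise_variance=0.25)
print(high_fidelity < low_fidelity)  # True
```

With these assumed noise levels, the low-fidelity evaluation leaves a standard deviation of about 0.45 against about 0.10 for the high-fidelity one, so the region around it remains a stronger candidate for further sampling.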
Numerical Simulations and 3D shape Optimization
Multi-objective optimization has a multitude of applications in the realm of numerical simulations. 3D shape design optimization is a particularly interesting domain for such applications. The examples here are numerous, from optimizing the aerodynamic or hydrodynamic characteristics of a design through computational fluid dynamics (CFD) to ensuring proper solid material rigidity through structural analysis.
If you are a mechanical engineer working on the design of a propeller blade, or perfecting the profile of an airplane wing, you will almost certainly rely on Computational Fluid Dynamics (CFD) simulations, which solve the Navier-Stokes equations numerically, as a cheaper proxy for physical tests. If you are a medical engineer carefully calibrating the design of a cardiovascular pump, CFD simulations would likewise be among your most important tools to ensure optimal fluid flow through your design. In these two scenarios, and numerous others, CFD simulations prove very important, especially given how much they can reduce the need for physical tests. When it comes to computational cost, however, CFD simulations are among the most expensive function evaluations. A full-blown high-fidelity simulation may well last a few days. As an engineer with a strong problem-solving mindset, would you always wait for every single simulation result? Most probably you wouldn’t. You’d rather build a surrogate!
A very similar process applies to other engineering disciplines such as mechanical and electromagnetic engineering. In mechanical structural analysis, for example, Finite Element (FE) solvers are widely used to predict design performance characteristics such as part distortions due to internal stresses. FE solvers are based on the Finite Element Method, which approximates the solution of a solid material analysis by discretizing the continuous body of the design into a large but finite number of elements and solving the problem on each element along with the corresponding boundary conditions. The smaller the element size, the more accurate, but also the more expensive, the simulation results are.
The above examples represent only a small fraction of the applications of surrogate modeling for shape optimization. Engineers are clearly not oblivious to the great potential of surrogate models in accelerating 3D simulations and optimizing their designs. Yet, with the limitations imposed by the parametrization requirements of most current GP-based surrogate models, it is still rather difficult to harness the full benefits of surrogate models in fully automated optimization loops.
What about Deep-Learning based surrogates?
More recently, a new family of surrogate models has been gaining traction in scientific engineering circles. By training a deep neural network to learn the surrogate model, one can not only overcome the low-dimensionality limitation of Gaussian Process based surrogates but, in some cases, also obtain superior prediction accuracy. The Deep Learning approach is also more interoperable. Dropping the handcrafted shape parameterization requirement means that we can train the network on arbitrary shapes, exploiting the capabilities of transfer learning and leveraging geometries from different datasets that were otherwise locked in project silos, with a distinct regressor trained exclusively on each set.
Not very surprisingly, though, this new technology comes with its own set of challenges. At first sight, 3D Convolutional Neural Networks (CNNs) might appear to be the right candidate for the task. However, their large memory footprint makes it difficult to fit all the data required for their training into memory, even considering the capacity of modern GPUs. A possible mitigation, one that compromises accuracy, is to train the network on a relatively coarse discretization of the volumetric input into voxels. A better alternative is to use a newer CNN architecture, the Geodesic Convolutional Neural Network (GCNN). GCNNs learn directly from the surface mesh representation of 3D geometries and can thus considerably reduce the computing requirements compared with plain CNNs. Besides their enhanced usability and interoperability, GCNN-based surrogate models offer another important advantage over Gaussian Process and other parametrized forms of surrogates in the context of 3D shape optimization. Because the network is trained directly on the surface mesh of the shape, it is possible to back-propagate the gradients down to the original vertices, enabling a free-form, first-order optimization of the shape. This can be very useful when embedded in a multi-objective optimization process. One still has to integrate the proper constraints, though, to guarantee smoothness and to preserve other design requirements.
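The idea of a free-form, first-order update on vertex positions with a smoothness term can be sketched on a toy problem. This is not a GCNN and not the method of any particular software: the shape is a closed 2D polygon, the surrogate is replaced by a hypothetical quadratic stand-in (distance to a `target` shape) whose gradient is written by hand, and the smoothness constraint is approximated by a simple Laplacian penalty. In a real pipeline, the gradient with respect to the vertices would instead come from back-propagation through the trained network.

```python
import numpy as np

def laplacian(vertices):
    """Offset of each vertex from the midpoint of its two neighbours on a closed polygon."""
    return vertices - 0.5 * (np.roll(vertices, 1, axis=0) + np.roll(vertices, -1, axis=0))

def objective_and_gradient(vertices, target, smoothness_weight):
    """Toy objective: quadratic stand-in for the surrogate plus a smoothness penalty."""
    residual = vertices - target
    lap = laplacian(vertices)
    value = 0.5 * np.sum(residual ** 2) + 0.5 * smoothness_weight * np.sum(lap ** 2)
    # The Laplacian operator is symmetric, so the penalty's gradient applies it twice.
    gradient = residual + smoothness_weight * laplacian(lap)
    return value, gradient

def optimize_vertices(vertices, target, smoothness_weight=0.5, step=0.1, n_steps=100):
    """Plain gradient descent applied directly to the vertex coordinates."""
    values = []
    for _ in range(n_steps):
        value, gradient = objective_and_gradient(vertices, target, smoothness_weight)
        values.append(value)
        vertices = vertices - step * gradient
    return vertices, values

# Start from a unit circle and move toward a hypothetical elliptical target.
angles = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
start = np.stack([np.cos(angles), np.sin(angles)], axis=1)
target = np.stack([1.5 * np.cos(angles), 0.8 * np.sin(angles)], axis=1)
final_vertices, values = optimize_vertices(start, target)
```

Every vertex is a free variable here, which is what makes the update "free form"; the penalty weight trades fidelity to the objective against smoothness of the resulting outline.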
The above-mentioned Geodesic Convolutional Neural Networks are the core architecture of the surrogate models provided by Neural Concept Shape, the first software interface of its kind dedicated to 3D Deep Learning shape optimization.
This article presented a very brief, high-level overview of multi-objective global function optimization and the benefits one can unlock by using deep learning approaches to construct the surrogate model used in the optimization process. The ideas presented here are inspired by the optimization framework used in the Neural Concept Shape software. Some interesting use cases can be found by browsing our website. Further reading materials and related articles are also listed below.
- Deep Fluids: A Generative Network for Parameterized Fluid Simulations. https://arxiv.org/pdf/1806.02071.pdf
- Deep learning for mechanical property evaluation. http://news.mit.edu/2020/deep-learning-mechanical-property-metallic-0316
- Learning to Simulate Complex Physics with Graph Networks. https://arxiv.org/abs/2002.09405
- Geodesic Convolutional Shape Optimization. https://arxiv.org/abs/1802.04016