Self-optimization of chemical reactions using machine learning multi-objective algorithms has the potential to significantly shorten overall process development time, providing users with valuable information about economic and environmental factors. Using the Thompson Sampling Efficient Multi-Objective (TS-EMO) algorithm, the self-optimization flow chemistry system in this report demonstrates the ability to identify optimum reaction conditions and trade-offs (Pareto fronts) between conflicting optimization objectives, such as yield, cost, space-time yield, and E-factor, in a data-efficient manner. Advantageously, the robust system consists exclusively of commercially available equipment and a user-friendly MATLAB graphical user interface, and was shown to autonomously run 131 experiments over 69 hours uninterrupted.
A user-friendly MATLAB graphical user interface was utilized in combination with a robust machine learning self-optimization flow chemistry system to identify optimum reaction conditions and trade-offs (Pareto fronts) between optimization objectives, such as yield, cost, space-time yield, and E-factor, in a data efficient manner.
Despite the prevalence of established techniques such as Design of Experiments (DoE), reaction optimization is still often a difficult and time-consuming task for chemists.1 Identifying where improvements can be made is challenging due to the large number of process variables with many different possible combinations that should be tested. This issue can be alleviated using self-optimizing systems that combine programmable chemical handlers, a machine-learning reaction optimization algorithm, and online analytical techniques in a real-time adaptive feedback optimization loop (Figure 1). Examples of analytical methods suitable for self-optimizing experimental systems include gas chromatography (GC), high-performance liquid chromatography (HPLC), mass spectrometry (MS), in-situ infrared spectroscopy (IR) and nuclear magnetic resonance (NMR) spectroscopy.2 A significant advantage of these types of systems is that the optimization procedure can be entirely automated, with no user intervention required.
Figure 1
General flow chart of a reaction self-optimization system.
Reaction optimization conducted by chemists is typically measured against multiple performance criteria such as yield, cost, impurities profile, and environmental impacts. Therefore, the ability for the automated process to self-optimize for multiple objectives is highly desirable. The majority of existing self-optimizing systems utilize single-objective optimization algorithms, such as the Nelder-Mead simplex (NMSIM) and Stable Noisy Optimization by Branch and FIT (SNOBFIT).3–5 Owing to the significantly increased complexity of multiple objective optimization, there are few algorithms that have been demonstrated to efficiently perform this task. Whilst multiple objectives can be scalarized into a single function, the weighting given to individual objectives is subjective when compared to multi-objective optimization.
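The subjectivity of scalarization can be illustrated with a minimal sketch. The two candidate conditions, their yield and cost values, and the weighted-sum form below are all hypothetical and chosen only for illustration; they are not taken from this study.

```python
# Hypothetical candidates: name -> (yield in %, cost in arbitrary units).
CANDIDATES = {"A": (90.0, 10.0), "B": (60.0, 2.0)}


def scalarize(yield_pct, cost, w_yield, w_cost):
    """Weighted-sum score: reward yield, penalize cost (higher is better)."""
    return w_yield * yield_pct - w_cost * cost


def best_candidate(w_yield, w_cost):
    """Return the candidate name with the highest scalarized score."""
    return max(
        CANDIDATES,
        key=lambda name: scalarize(*CANDIDATES[name], w_yield, w_cost),
    )


# The preferred conditions change with the (subjective) choice of weights.
print(best_candidate(w_yield=1.0, w_cost=1.0))  # A scores 80, B scores 58
print(best_candidate(w_yield=1.0, w_cost=5.0))  # A scores 40, B scores 50
```

In this toy case the "optimum" flips from A to B purely because the cost weight was raised, which is the kind of user-dependent outcome that a true multi-objective treatment avoids.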
Another key point for multi-objective algorithms is that objectives sometimes compete with one another (e.g. yield vs. cost), which makes it impossible to find a single set of ‘utopian’ conditions that corresponds to optimal values for both objectives. One representation of competing multi-objective optimization is a Pareto front (Figure 2),6 which is a set of non-dominated data points where neither objective can be improved without having a detrimental effect on the other, i.e. showing the trade-off between objectives. An example of an algorithm for efficient multi-objective reaction optimization is the open-source Thompson Sampling Efficient Multi-Objective (TS-EMO) algorithm.7 Lapkin and co-workers6, 8–10 have demonstrated the quality of the generated Pareto fronts, as well as the algorithm’s efficiency at identifying them, when compared with alternative algorithms such as ParEGO.11 Other examples of multi-objective algorithms12 developed for chemical processes include Phoenics13 and Chimera.14
Figure 2
An illustration of a Pareto front (made up of non-dominated solutions) in a system with two competing optimization objectives, where values in the infeasible region under the Pareto front are inaccessible to the optimization process.
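The notion of non-dominated solutions can be made concrete with a short sketch. It assumes two objectives, yield (maximized) and cost (minimized); the data points are hypothetical and serve only to show how dominated experiments are filtered out. This is a brute-force filter for illustration, not the TS-EMO algorithm.

```python
def dominates(a, b):
    """True if point a dominates point b.

    Points are (yield, cost) tuples: yield is maximized, cost is minimized.
    a dominates b if it is no worse in both objectives and strictly
    better in at least one.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated subset of points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (yield %, cost) results.
results = [(50, 1.0), (70, 2.0), (60, 3.0), (90, 5.0), (80, 6.0)]
print(pareto_front(results))  # (60, 3.0) and (80, 6.0) are dominated
```

Every point left in the returned list represents a different compromise: moving along it raises the yield only at a higher cost.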
The application of flow chemistry over batch methods for self-optimizing systems has significant advantages. Flow processing is inherently safer under high temperature and pressure conditions (process intensification conditions), and in situ analysis and closed-loop optimization are easier to implement in flow because automated direct sampling of the reaction solution can be performed using in-line small-volume injectors or non-invasive spectroscopic sampling. Furthermore, subsequent flow chemistry reactions can be conveniently initiated with different continuous reaction variables by modulating reactor temperatures and flow rates. Conversely, the screening of continuous variables in batch reactions is inefficient, typically requiring expensive robotic equipment.15
Self-optimization flow systems reported in the literature typically utilize custom-designed setups (consisting of pumps, reactors, samplers, and analytical equipment) interfaced with in-house software, which could be detrimental to the widespread adoption and rapid development of these tools. Furthermore, systems are sometimes developed for specific reactions, where modifying a system for a different reaction often requires considerable effort and time, even by experts.16 In contrast, commercially available modular flow chemistry systems, for example those by Vapourtec, have been demonstrated to be effective in conducting many different reactions.17–21 Moreover, for more complex and scripted applications such as self-optimization, some systems can be remotely controlled through their standard software packages using application programming interfaces (APIs) that are provided by the manufacturers and can be called from popular programming environments such as MATLAB or Python.
In this study we aim to further develop autonomous self-optimization flow chemistry systems by developing a robust implementation, based on commercially available equipment and a proven ML algorithm, that is suitable for various single-step reaction optimization studies. The system has been demonstrated on a sample reaction exhibiting competing reaction pathways, where optimization of process parameters is known to lead to multiple possible “optimal” sets of reaction conditions. Here we also aim to further investigate the optimization behavior of the TS-EMO Bayesian optimizer with respect to exploitation vs. exploration of experimental parameter space.
The case study reaction was the aldol condensation reaction between benzaldehyde (1) and acetone (2), catalyzed by sodium hydroxide (3) base, to give the desired benzylideneacetone (4) product (Scheme 1). The possible side-reactions to form dibenzylideneacetone (5) or acetone polymerization side-products represent an ideal challenge for careful control of reaction conditions chosen by the algorithm.
Scheme 1
Reaction scheme for the sodium hydroxide (3) catalyzed aldol condensation case study between benzaldehyde (1) and acetone (2) to produce benzylideneacetone (4) at reactor temperature, T, with residence time, tres.
The self-optimization system utilized in this work features exclusively commercially available equipment and the TS-EMO multi-objective optimization algorithm (Figure 3). The flow chemistry equipment consists of two Vapourtec R2 modules and an R4 reactor module for controlling solution flows and reactor temperatures, respectively. These parameters are controlled from within the software provided by the manufacturer.
Figure 3
Schematic of the self-optimization system containing Vapourtec flow chemistry pumps and reactor, a 4-way sample injector, HPLC-UV analysis, and algorithmic reaction optimization, controlled using a MATLAB-based environment. BPR: back pressure regulator.
Designed for mesoscale flow chemistry,22 the system uses plug-flow modelling to calculate the flow rates and pump timings required for the desired reaction-zone plug sizes, to determine the solution compositions within a plug, and to automatically signal reaction samplers and analytical equipment when the system is deemed to have reached steady state. These features allow for easy implementation of direct reaction mixture sampling at steady state using a microliter injector into an online HPLC-UV instrument. A bespoke MATLAB user interface was developed to control all aspects of the self-optimization process, including control of physical equipment through an interface with the commercial software, creation of training data sets, reading of HPLC data and calculation of optimization objectives, and the complete, autonomous execution of flow chemistry experiments.23 This process was repeated iteratively until the user terminated the MATLAB environment. It should be noted that any downstream processes, such as purification steps, were not taken into consideration in this work. Therefore, the cost and chemical use in these subsequent processes were not accounted for in the objective calculations.
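As a rough illustration of how pump flow rates can follow from a chosen residence time and reagent stoichiometry, the sketch below assumes a simplified continuous steady-state model in which the total flow rate equals the reactor volume divided by tres. The function name, reactor volume, and stock concentrations are hypothetical, and this is not the calculation performed by the manufacturer's software, which works with reaction-zone plugs.

```python
def pump_flow_rates(v_reactor_ml, t_res_min, equivalents, concentrations_molar):
    """Split the total flow between reagent streams (all values in mL/min).

    Assumes steady-state continuous pumping: total flow = V_reactor / t_res.
    Each stream's share is proportional to equivalents / concentration, so
    that the molar flow ratios match the requested equivalents.
    """
    total_flow = v_reactor_ml / t_res_min
    shares = {
        name: equivalents[name] / concentrations_molar[name]
        for name in equivalents
    }
    total_share = sum(shares.values())
    return {name: total_flow * s / total_share for name, s in shares.items()}


# Hypothetical reactor volume, stoichiometry, and stock concentrations.
rates = pump_flow_rates(
    v_reactor_ml=10.0,
    t_res_min=5.0,
    equivalents={"benzaldehyde": 1.0, "acetone": 5.0, "naoh": 0.1},
    concentrations_molar={"benzaldehyde": 0.5, "acetone": 5.0, "naoh": 0.1},
)
print(rates)
```

With these illustrative inputs the three streams sum to 2 mL/min, and multiplying each flow rate by its stock concentration recovers the requested 1 : 5 : 0.1 molar ratio.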
The four continuous variables optimized in all cases of this study were (i, ii) the molar equivalents of acetone and sodium hydroxide (relative to benzaldehyde), (iii) reactor temperature (T), and (iv) residence time (tres); see Table 1 for the user-defined lower and upper limits. The volume of benzaldehyde solution was fixed for each reaction. The upper limit for T was chosen as 70 °C to help avoid acetone polymerization, which results in poorly soluble products that clog the flow path and tubular reactor.24 The residence time limits were set to ensure that reactor pressure was not excessive in the quickest experiments, whilst keeping the longest experiments to within 45 min.
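A bounded four-variable search space and an initial training set can be sketched as follows. The section does not state how the training experiments were chosen, so Latin hypercube sampling is assumed here purely as one common space-filling option. All limits below are hypothetical placeholders rather than the values in Table 1; only the 70 °C upper temperature limit is taken from the text.

```python
import random

# Hypothetical placeholder limits (lower, upper); NOT the Table 1 values.
# Only the 70 degC upper temperature limit is stated in the text.
BOUNDS = {
    "acetone_equiv": (1.0, 10.0),
    "naoh_equiv": (0.05, 0.5),
    "temperature_c": (30.0, 70.0),
    "t_res_min": (5.0, 20.0),
}


def latin_hypercube(bounds, n_samples, seed=0):
    """Draw n_samples points with one sample per equal-width stratum
    of every variable, pairing strata randomly across variables."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds.values():
        # One uniform draw inside each of the n_samples strata of [0, 1).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(unit)  # decouple the strata order between variables
        columns.append([low + u * (high - low) for u in unit])
    return [dict(zip(bounds, row)) for row in zip(*columns)]


training_set = latin_hypercube(BOUNDS, n_samples=20)
print(training_set[0])
```

Each dictionary in `training_set` is one candidate experiment; a fixed seed is used only so that the sketch is reproducible.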
Figure 7
Plots for cost vs yield for specific experiments on the Pareto front related to the self-optimization of aldol condensation reaction in Scheme 1 with limits from Table 1.
The final self-optimization performed in this study targeted reaction conditions that would maximize space-time yield (STY) and minimize the environmental impact using the E-factor metric. Space-time yield is a measure of reactor productivity related to the mass of product 4 formed (mproduct), the reactor volume (Vreactor), and tres (Eq. 3), whilst E-factor29 is defined as the ratio of the mass of waste (mwaste) to mproduct (Eq. 4).
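Written out, the two objectives take the following form. The E-factor expression follows directly from the definition given in the text; the STY expression is the conventional definition and is assumed here to match the original Eq. 3, with mproduct in g, Vreactor in L, and tres in h so that STY has units of g L−1 h−1.

```latex
\begin{align}
\mathrm{STY} &= \frac{m_{\mathrm{product}}}{V_{\mathrm{reactor}} \, t_{\mathrm{res}}} \tag{3} \\
E\text{-factor} &= \frac{m_{\mathrm{waste}}}{m_{\mathrm{product}}} \tag{4}
\end{align}
```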
The same initial training set of 20 experiments from the previous optimizations, as well as the same lower and upper variable limits (Table 1), were used to commence the process. It should be noted that, in a plot of log10(E-factor) against STY, the original training set was not spread across the optimization space as it had been for the previous optimizations (Figure 8). This suggested that there was no trade-off between the targets and therefore there could be a utopian optimum in this instance. The results after 55 TS-EMO optimization iterations confirmed that there was no Pareto front for these objectives, and instead identified an optimum where STY was 237.43 g L−1 h−1 and E-factor was 39.7 (Figure 8). Like the optimal conditions for the maximum yields in the previous optimizations, the ideal reaction conditions for achieving high STY and low E-factor corresponded to high acetone equivalents (9.94 equiv.), as well as a low tres of 5.1 min. The absence of a Pareto front is due to the closeness of the densities of the benzaldehyde, acetone and sodium hydroxide solutions (0.795, 0.785 and 0.792 g mL−1, respectively; see ESI for derivation). As the solvent accounts for most of the waste generated, the amount of waste generated is very similar between experiments. This leaves both STY and E-factor mostly dependent on product quantity, which allowed a single optimum result to be identified.
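The argument that near-constant waste removes the trade-off can be checked numerically with a small sketch. It assumes the conventional STY definition (mass of product divided by reactor volume and residence time), holds tres and the waste mass fixed as a simplification, and uses hypothetical masses and reactor volume; none of the numbers are from this study.

```python
def space_time_yield(m_product_g, v_reactor_l, t_res_h):
    """STY in g L^-1 h^-1 (conventional definition, assumed here)."""
    return m_product_g / (v_reactor_l * t_res_h)


def e_factor(m_waste_g, m_product_g):
    """E-factor: mass of waste per mass of product."""
    return m_waste_g / m_product_g


# Hypothetical experiments: product mass varies, waste mass is held constant
# because the solvent is assumed to dominate the waste stream.
M_WASTE_G = 20.0
V_REACTOR_L = 0.01
T_RES_H = 0.1
product_masses_g = [0.1, 0.2, 0.4]

sty_values = [space_time_yield(m, V_REACTOR_L, T_RES_H) for m in product_masses_g]
e_values = [e_factor(M_WASTE_G, m) for m in product_masses_g]

# Index of the best experiment under each objective.
best_by_sty = max(range(len(sty_values)), key=sty_values.__getitem__)
best_by_e = min(range(len(e_values)), key=e_values.__getitem__)
print(best_by_sty, best_by_e)
```

Under these assumptions both objectives are monotonic in product mass, so the same experiment is best for each and no Pareto front arises. When tres also varies this need not hold in general; in the reported optimization the highest productivity happened to coincide with a short residence time.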
Figure 8
A plot of E-factor against STY for experiments related to the self-optimization of aldol condensation reaction in Scheme 1 with limits from Table 1 (Self-Optimization Three). In this case, there is an optimum solution for these two optimization targets, indicating there is no trade-off between them and therefore no Pareto front is present.
A self-optimization system consisting of a bespoke MATLAB user interface, a commercially available flow chemistry system, sampling and HPLC equipment, and a self-optimizing algorithm was built and demonstrated autonomous uninterrupted operation for as many as 131 reactions over 69 hours. The multi-objective optimization algorithm was shown to rapidly exploit the optimization space and locate optimum reaction conditions, as well as key trade-off zones when competing objectives were under investigation. In the aldol condensation case study shown in Scheme 1, multi-objective optimizations to simultaneously maximize yield and minimize cost indicated that these two performance criteria competed with each other and formed a clear Pareto front. In contrast, optimizations to maximize STY and minimize E-factor converged towards a single set of optimum reaction conditions.
Given the modularity of the commercial system employed, the flow chemistry setup can be easily modified with different supported components (such as pumps, tubing, reactors, purification modules) and/or additional chemical handlers for reactant loading/product collection. With respect to the handling of discrete variables, such as reagents and solvents, the TS-EMO optimizing algorithm was recently reported to be successful in optimizing for solvents in a ruthenium-catalyzed asymmetric hydrogenation reaction.10 Developments into handling discrete variables are currently underway in our laboratory, with the aim to demonstrate the improved capabilities and efficiencies using robotic workflows in process development.
The project is funded by Pharma Innovation Programme Singapore (PIPS).
The authors declare no conflict of interest.