Understanding the formation, structure, and properties of emulsions is essential to creating and stabilizing food products. Emulsion technology allows the design of new products with desired physicochemical properties such as texture, taste, flavor, and stability. The stability and lifetime of an emulsion depend on its components and preparation method. Experiment-based formulation development requires significant resources because several design parameters must be evaluated. Virtual experiments can reduce design costs by efficiently screening key parameters, such as the interfacial tension between the two phases.
Improving emulsion formulations requires better knowledge of the factors governing the physicochemical properties of organic molecules in contact with the water phase. Surfactants are organic molecules used to reduce the interfacial tension between the dispersed and continuous phases, which improves the dispersion of the emulsion. Virtual screening methods based on quantitative structure-property relationship (QSPR) analysis establish correlations between the structure of compounds and their physical, chemical, biological, or environmental properties. They do so by combining statistical modeling with chemical information encoded in numerical quantities called descriptors. A prerequisite is a sufficiently large and consistent data set that can be used to train and validate the QSPR model.
In the present case study, QSPR analysis was used to relate the molecular structure of small organic molecules (including fatty acids and lipids) to their known interfacial tension in contact with water at standard conditions (25 °C and 1 atm). The calculations were performed using the alvaDesc plugin within the MAPS platform.
The training set used for model development consisted of 67 organic molecules. The dependent variable is the interfacial tension at ambient conditions. The final QSPR model was selected on the basis of several statistical criteria: a high coefficient of determination (R² = 0.8), a small RMSE (6.0427), and a small number of descriptors, which avoids overfitting that would limit the model's predictive power.
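As an illustration of the two fit statistics named above, the following minimal Python sketch computes the coefficient of determination and the RMSE from paired experimental and calculated values. The numbers are made up for demonstration and are not taken from the 67-molecule data set. The function names are hypothetical, and the RMSE here divides by the number of samples, which is an assumption since the text does not state which convention was used.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error, dividing by the sample count (assumed convention)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))


# Illustrative values only; NOT data from the case study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print("R2   =", r_squared(observed, predicted))
print("RMSE =", rmse(observed, predicted))
```

A high R² together with a low RMSE on the training set only indicates a good fit; it says nothing by itself about predictive power, which is why the number of descriptors is kept small.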
The figure shows the experimental vs. calculated values of interfacial tension for all compounds in the training set. Four descriptors, drawn from the families of topological indices, information indices, ETA indices, and edge adjacency indices, were found to be the most important for correlating the interfacial tension data. These descriptors are related to the electrotopological state, neighborhood symmetry, hydrogen bond donor atoms, and the dipole moment. Because of the limited size of the training set, the leave-one-out validation method was used.
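The leave-one-out procedure can be sketched as follows: each compound is removed in turn, the model is refitted on the remaining compounds, and the held-out value is predicted. The sketch below is a simplification under stated assumptions: it uses a single hypothetical descriptor and an ordinary least-squares line, whereas the model described above uses four descriptors, and the data are invented. The summary statistic Q² (one minus the ratio of the predictive residual sum of squares to the total sum of squares) is a common choice, not one the text confirms.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_q2(xs, ys):
    """Cross-validated Q2 = 1 - PRESS / SS_tot using leave-one-out."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without sample i, then predict sample i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and property values; NOT data from the case study.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
tension = [2.1, 3.9, 6.2, 7.8, 10.1]

q2 = leave_one_out_q2(descriptor, tension)
print("LOO Q2 =", q2)
```

Because every prediction is made for a compound the model has not seen, a Q² well below the training R² would suggest overfitting.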
QSPR modeling can efficiently support and guide experimental research by creating digital twins of target compounds, predicting their properties, and identifying the most promising candidates, which can drastically reduce development time and cost. Moreover, such predictions can be used to compute the lifetime of the emulsion with the Cahn-Hilliard equation and, in turn, to identify the optimum product formulation with an extended lifetime.
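For reference, the Cahn-Hilliard equation mentioned above is commonly written in the form below. This is the standard textbook form, not necessarily the variant used in the case study; the symbols are assumptions introduced here: c is the composition (order parameter), M the mobility, μ the chemical potential, κ the gradient-energy coefficient, and f a double-well bulk free energy.

```latex
\frac{\partial c}{\partial t} = \nabla \cdot \left( M \, \nabla \mu \right),
\qquad
\mu = f'(c) - \kappa \, \nabla^{2} c,
\qquad
f(c) = \tfrac{1}{4}\left(c^{2} - 1\right)^{2}
```

In models of this type the interfacial tension is determined by κ together with the shape of f, which is one plausible route by which a QSPR-predicted interfacial tension could enter the lifetime calculation.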