polyBERT is a system that treats the chemical structure of polymers as a chemical language: each word that can be formed in this language is a unique name for a theoretically possible polymer. These names reflect the molecular building blocks and structures of the respective polymers. Building on recent insights from linguistics and computer science, the research team in Bayreuth and Atlanta has trained polyBERT and developed it into a learning system.
From polymer language to digital "fingerprints"
In a first step, polyBERT learned the names of about 100 million theoretically possible polymers. These names are combinations of molecular units contained in approximately 13,000 polymers. Through this training, polyBERT understands the polymer language and correctly identifies the building blocks and structures of about 100 million polymers. The system can even use the polymer language on its own: polyBERT can generate further names of previously unknown but theoretically possible polymers.
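How a small set of molecular units can yield a very large number of names can be illustrated with a minimal Python sketch. The building blocks, the SMILES-like notation with `[*]` marking the two chain ends, and the function name `enumerate_names` are assumptions made for illustration. The article does not specify the notation or the procedure the researchers used, and the strings below are not checked for chemical validity.

```python
from itertools import product

# Hypothetical building blocks, chosen only for illustration.
# No block is a prefix of another, so every concatenation is a distinct string.
BUILDING_BLOCKS = ["CC", "C(C)", "O", "C(=O)", "c1ccc(cc1)"]


def enumerate_names(blocks, max_units):
    """Yield every name built from 1..max_units blocks, wrapped in [*] end markers."""
    for n in range(1, max_units + 1):
        for combo in product(blocks, repeat=n):
            yield "[*]" + "".join(combo) + "[*]"


names = list(enumerate_names(BUILDING_BLOCKS, 3))
print(len(names))  # 5 + 25 + 125 = 155 candidate names
```

The count grows as a power of the number of units per name, which is why a few thousand known polymers can give rise to many millions of combinations. The figure counts strings, not chemically distinct polymers.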
Linked to this chemical language expertise is another capability: polyBERT automatically translates the polymer names it knows into numerical representations, so-called "fingerprints". Each fingerprint is a unique code word consisting of numbers from which the building blocks and structure of the respective polymer can be inferred. Generating these digital fingerprints automatically is far less error-prone and much faster than having humans craft a fingerprint for the chemical structure of each polymer.
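The idea of turning a polymer name into a fixed-length list of numbers can be sketched in a few lines of Python. This is a deliberately simple stand-in, hashing character trigrams into buckets, and not the learned language model that polyBERT uses. The function name `toy_fingerprint`, the vector length of 16, and the example name are assumptions for illustration.

```python
import hashlib
import math


def toy_fingerprint(name: str, dim: int = 16, n: int = 3) -> list[float]:
    """Map a polymer name to a fixed-length, L2-normalised numeric vector.

    Stand-in technique: hashed character n-gram counts (not polyBERT's method).
    """
    vec = [0.0] * dim
    # Slide a window of n characters over the name; short names yield one window.
    for i in range(max(len(name) - n + 1, 1)):
        gram = name[i:i + n]
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


fp = toy_fingerprint("[*]CC(=O)O[*]")  # hypothetical polymer name
print(len(fp))  # 16
```

The same name always produces the same vector, which is the property that makes machine-generated fingerprints reproducible. Unlike the fingerprints described in the article, this toy version does not guarantee that two different names receive different vectors.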
Rapid and precise prediction of polymer properties
polyBERT derives its enormous practical relevance from the fact that the researchers in Bayreuth and Atlanta have taught it numerous characteristic polymer properties that are particularly relevant for technological applications. The system is therefore able to unambiguously correlate the fingerprints and properties of polymers. Novel techniques from the field of artificial intelligence enable polyBERT to select, with high accuracy and at unprecedented speed, the polymers required for specific applications from the 100 million theoretically possible polymers. "polyBERT is an exceptionally high-performance system for rapid and accurate prediction of polymer properties. Therefore, our research has the potential to significantly accelerate the design, synthesis and technological application of polymers," says Kuenneth.
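The screening step, predicting a property from a fingerprint and keeping only the candidates that meet a requirement, can be sketched as follows. The nearest-neighbour lookup is a stand-in for the trained predictor and is not the method described here. All fingerprints, property values, names and the threshold are made up for illustration.

```python
import math

# Hypothetical reference data: (fingerprint, measured property value).
# All numbers are invented for this sketch.
KNOWN = [
    ([1.0, 0.0, 0.0], 350.0),
    ([0.0, 1.0, 0.0], 420.0),
    ([0.0, 0.0, 1.0], 280.0),
]


def predict(fp, known=KNOWN):
    """Return the property value of the known polymer closest to fp."""
    _, value = min(known, key=lambda item: math.dist(fp, item[0]))
    return value


def screen(candidates, threshold):
    """Return the names of candidates whose predicted value meets the threshold."""
    return [name for name, fp in candidates.items() if predict(fp) >= threshold]


# Hypothetical candidate polymers with made-up fingerprints.
CANDIDATES = {
    "candidate_a": [0.9, 0.1, 0.0],
    "candidate_b": [0.1, 0.9, 0.1],
    "candidate_c": [0.0, 0.2, 0.9],
}

hits = screen(CANDIDATES, threshold=380.0)
print(hits)  # ['candidate_b']
```

Because each prediction is only a lookup on numbers, the same loop can in principle run over millions of candidates, which is the kind of speed-up that fingerprint-based prediction is meant to deliver.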
Past study identifies bioplastics
The importance of machine learning approaches to polymer research was already demonstrated by an earlier study that Kuenneth published in the journal Communications Materials in December 2022. In it, he and research partners in Atlanta and at Los Alamos National Laboratory in the United States present a similar system, based on artificial neural networks, for predicting polymer properties. This system can help counter global plastic waste pollution. About 75 percent of industrially produced plastics are based on fossil raw materials. The system can significantly accelerate the search for biopolymers that can replace these plastics: the authors of the study identified 14 biologically producible and degradable polymers from 1.4 million possible candidates. These could replace current industrial plastics as soon as fast and cost-effective synthesis processes become available.
Publications:
Christopher Kuenneth, Rampi Ramprasad: polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics. Nature Communications (2023), DOI: https://doi.org/10.1038/s41467-023-39868-6
Christopher Kuenneth, Jessica Lalonde, Babetta L. Marrone, Carl N. Iverson, Rampi Ramprasad, Ghanshyam Pilania: Bioplastic design using multitask deep neural networks. Communications Materials (2022), DOI: https://doi.org/10.1038/s43246-022-00319-2
Jennifer Opel: The mixture makes the difference. Prof. Dr. Christopher Künneth takes over the professorship for Computational Materials Science for the Faculty of Engineering. UBT aktuell, March 2023. https://ubtaktuell.uni-bayreuth.de/en/christopher-kuenneth