30 Years of Data for Life
Building rich databases before Big Data
For more than three decades, Eurofins has been considered a leading provider of analytical services globally. The Group performs hundreds of millions of tests each year to establish the safety, identity, composition, authenticity, origin, traceability, and purity of biological substances and products. Its extensive databanks offer years’ worth of information about pharmaceuticals and food and their properties.
Many of the tests that Eurofins performs around the world rely on one of the company’s extensive proprietary databases, not simply as a reference against which results are compared, but very often as the means of obtaining the results themselves.
Eurofins’ DNA fingerprint database, for example, contains unique identifying characteristics (the “fingerprints”) of foodstuffs and enables proof of authenticity that had not previously been available. Not only does this prevent unscrupulous suppliers gaining an unfair market advantage, but it also prevents them endangering the health of the consumer.
Eurofins developed its proprietary DNA fingerprint databases for several specific analytical requests (e.g. basmati / fragrant rice authenticity or determination of different pine nut species), using proprietary or published DNA fingerprint methods and reference samples made available by authorities or via Eurofins’ laboratory network and its customers. The databases allow identification of pure and mixed samples, and also reveal the presence of as-yet unknown or unapproved species or varieties. The testing methods meet the need for traceability – a major requirement of EU food legislation – and can be adapted quickly to the needs of the market, for example when new species or varieties are approved by authorities and reference material becomes available.
Another exciting step-change in proprietary databases came in 1999 with the launch of Eurofins’ BioPrint™ database, designed to improve and optimise the selection of drug candidates. The database comprises a large and homogeneous set of experimental data, generated in-house, on more than 2,400 compounds, including marketed drugs, compounds which failed in clinical trials, and reference compounds.
Each compound has hundreds of pieces of information stored, with the BioPrint™ database covering in vitro assays as well as in vivo characteristics such as drug reactions, pharmacokinetics and therapeutic indications. On average there are 400 records for each compound, which across more than 2,400 compounds amounts to around 1 million records. High-quality, extensive datasets, combined with modelling and mining tools, place new drug candidates in the context of well-understood drugs. This allows scientists to anticipate adverse drug reactions and supports lead compound characterisation and prioritisation (a lead is a possible drug candidate which may still have a suboptimal structure and characteristics).
A further development is an analytical solution from Eurofins QTA that provides a major advancement in the way customers perform their infrared (IR) analysis. IR spectroscopy allows for fast analysis of simple parameters, even outside a laboratory environment, and is widely applied for raw material checks. The backbone is a proprietary database of IR spectra. Spectroscopy is the study of the interactions between matter and electromagnetic radiation. When electromagnetic radiation passes through a material, its molecules absorb energy at specific frequencies and, because different molecules absorb energy at different frequencies, the collected spectrum can be used to identify the material. QTA® uses near infrared (NIR) spectroscopy and mid-infrared (MIR) spectroscopy technologies in its qualitative analysis services.
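The idea that a collected spectrum can identify a material by comparison with a database of reference spectra can be illustrated with a minimal sketch. It is not the QTA® method, whose algorithms are proprietary; it simply assumes that every spectrum is a list of absorbance values sampled at the same wavelengths and picks the reference with the highest Pearson correlation. The function names, material names and absorbance values are all hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length absorbance vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def identify(spectrum, library):
    """Return (name, score) of the best-correlated reference spectrum."""
    scores = {name: pearson(spectrum, ref) for name, ref in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical absorbance values at five shared wavelengths (invented data).
REFERENCE_LIBRARY = {
    "material_a": [0.10, 0.80, 0.30, 0.05, 0.20],
    "material_b": [0.60, 0.10, 0.15, 0.70, 0.25],
}

unknown = [0.12, 0.75, 0.33, 0.07, 0.18]
match, score = identify(unknown, REFERENCE_LIBRARY)
print(match, round(score, 3))
```

Real spectra contain hundreds or thousands of points and are normally pre-processed (baseline correction, normalisation) before any comparison; those steps are left out here.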
The QTA® methods provide a simple interface for non-specialist users, thereby overcoming the challenges of traditional infrared analyses, which have to be developed and maintained by experienced spectroscopists or chemometricians.
Instruments at the testing site scan a sample and send the light spectra of the sample to the secure server via the internet. The analysis and data interpretation are completed within minutes on the server, and the results are returned in real time. Eurofins QTA has developed customised testing methods for beer, hops, sugar cane juice, feed, fertilisers, and formulated pesticides.
The science behind
Information about the origin of a food product is often encoded in its chemical composition, and rapid developments in science and technology over the last few decades now allow this information to be analysed and interpreted. Two of the most important techniques within DNA fingerprinting are DNA fragment length analysis and microsatellite or short tandem repeat (STR) analysis. DNA fragment length analysis considers changes in the length of a specific DNA sequence to indicate the presence or absence of a genetic marker. Eurofins used this technique to successfully detect a species of poor-tasting pine nuts which had triggered 39 biotoxin notifications in the EU Rapid Alert System for Food and Feed. STR analysis is used to compare specific areas of the DNA from two or more samples. This technique, again based on Eurofins’ extensive proprietary database of specific DNA fingerprints, was used by the company to prove the authenticity of basmati rice when the market was flooded with cheap imitations.
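The comparison step in STR analysis can be sketched in a few lines. The example below is an illustration only, not Eurofins’ procedure: it assumes each profile maps a locus name to the alleles (repeat counts) observed there, and it scores a sample by the number of loci at which its alleles agree with a reference entry. The locus names, variety names and repeat counts are hypothetical.

```python
def matching_loci(sample, reference):
    """Count the shared loci at which both profiles carry the same alleles."""
    shared = sample.keys() & reference.keys()
    return sum(
        1 for locus in shared if set(sample[locus]) == set(reference[locus])
    )


def best_match(sample, database):
    """Return (name, matching locus count) of the closest reference profile."""
    scores = {name: matching_loci(sample, ref) for name, ref in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical reference fingerprints: locus -> tuple of repeat counts.
FINGERPRINT_DB = {
    "variety_x": {"locus_1": (12,), "locus_2": (8,), "locus_3": (15,)},
    "variety_y": {"locus_1": (12,), "locus_2": (10,), "locus_3": (17,)},
}

sample = {"locus_1": (12,), "locus_2": (8,), "locus_3": (15,)}
variety, n_matches = best_match(sample, FINGERPRINT_DB)
print(variety, n_matches)
```

A sample that matches no reference at most loci, or that shows extra alleles at a locus, would point to an unknown variety or a mixture; handling those cases needs more than this simple count.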
The BioPrint™ project was started with the hypothesis that the in vitro pharmacological profiles of new drug candidates generated in Eurofins’ laboratory could act as a fingerprint, capturing information on the in vivo activity of the compound. The company found that hierarchical clustering of the drug and reference compounds based on their in vitro pharmacological profiles grouped many of them by therapeutic area or biological action; for example, antidepressants clustered with other antidepressants, and antifungals with other antifungal drugs. Using this “fingerprint”, choices about a drug candidate’s potential therapeutic use and adverse reactions can be made in the context of all the drug and reference compounds present in the database, by performing a simple profile similarity analysis to identify “neighbour” compounds.
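A profile similarity analysis of this kind can be sketched as a nearest-neighbour search. The example below is an assumption-laden illustration rather than the BioPrint™ implementation: each profile is taken to be a list of assay results (for instance, percent inhibition in the same four assays), similarity is measured with cosine similarity, and the compound names and values are invented.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length assay profiles."""
    dot = sum(x * y for x, y in zip(p, q))
    norm_p = math.sqrt(sum(x * x for x in p))
    norm_q = math.sqrt(sum(y * y for y in q))
    return dot / (norm_p * norm_q)


def nearest_neighbours(candidate, database, k=2):
    """Return the names of the k reference compounds most similar to candidate."""
    ranked = sorted(
        database,
        key=lambda name: cosine_similarity(candidate, database[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical in vitro profiles: one value per assay, same assay order.
PROFILES = {
    "reference_antidepressant_1": [85, 70, 5, 10],
    "reference_antidepressant_2": [90, 60, 10, 5],
    "reference_antifungal_1": [5, 10, 80, 75],
}

candidate = [80, 65, 8, 12]
neighbours = nearest_neighbours(candidate, PROFILES, k=2)
print(neighbours)
```

If a candidate’s closest neighbours share a therapeutic area or a known adverse reaction, that is a hint worth investigating, not a conclusion; the choice of similarity measure and of k is a design decision made here purely for illustration.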
The backbone of the QTA® solution is a proprietary database developed using Chingometrics methods, which cover all Eurofins QTA applications. Chingometrics are unique data treatment methodologies applied to calibration models for spectroscopic qualitative analysis applications. The proprietary database and algorithms are dynamically maintained for superior accuracy and precision, with primary data generated using industry-standard methods and stored on a highly secure central server.