PyCon UK


The story of a 52-year-old database

A historical look at the Cambridge Structural Database – The World’s Repository of Organic Chemical Structures

Stewart Adcock

Monday 17th, 11:30 (Room D)


A talk (25 minutes)

Abstract:
Some notable points looking back along the timeline of database technology:
- 1995 - The first version of MySQL released.
- 1987 - SQL became a standard of the International Organization for Standardization (ISO).
- 1986 - SQL became a standard of the American National Standards Institute (ANSI).
- 1983 - Andreas Reuter and Theo Härder coined the acronym ACID.
- 1974 - The first prototype version of the INGRES relational DBMS released.
- 1971 - The first microprocessor launched - the Intel 4004.
- 1970 - The relational model of data was proposed by E.F. Codd.
- 1967 - Introduction of the DEC PDP-10 computer.
- 1965 - Compilation of the Cambridge Structural Database (CSD) began.

The Cambridge Structural Database (CSD) was initiated as a response to an “information explosion”, with aims including “the compilation of a numerical database relating to organocarbon crystal structures”. Five decades later, this database is still curated and widely distributed within the scientific community. The CSD is probably the oldest numerical database actively maintained and used today.

In 1966, the database filled one volume of a printed book. In 1991, it occupied 196 MB. Today, a SQLite version of the database occupies 6.4 GB. Along the way, the Cambridge Crystallographic Data Centre (CCDC) has driven many computational and scientific advances; for example, it developed the first practical 3D chemical search system. Learn about past innovations and technical foundations.

That is all very interesting, but what does it have to do with Python? The CSD Python API is a mechanism for accessing the data, alongside a wide range of molecular modelling and cheminformatics algorithms. This API is being adopted rapidly in both academic and industrial chemistry communities. Learn how we can better serve our users thanks to Python.
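As a rough illustration of reading a SQLite-backed dataset from Python, here is a minimal sketch using only the standard-library sqlite3 module. It does not use the CSD Python API and does not reflect the real CSD schema: the table name, column names, identifiers and rows below are all made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'entries' table."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema for illustration only; not the real CSD layout.
    conn.execute(
        "CREATE TABLE entries ("
        "refcode TEXT PRIMARY KEY, formula TEXT, year INTEGER)"
    )
    # Placeholder rows; these are not real database entries.
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?, ?)",
        [
            ("DEMO01", "C6 H6", 1966),
            ("DEMO02", "C2 H6 O1", 1991),
            ("DEMO03", "C10 H8", 2017),
        ],
    )
    conn.commit()
    return conn


def entries_since(conn, year):
    """Return the identifiers of entries recorded in or after the given year."""
    rows = conn.execute(
        "SELECT refcode FROM entries WHERE year >= ? ORDER BY refcode",
        (year,),
    )
    return [refcode for (refcode,) in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(entries_since(demo, 1990))
```

The query uses a bound parameter rather than string formatting, which is the usual way to pass values to SQLite from Python.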


  • The speaker suggested this session is suitable for new programmers.
  • The speaker suggested this session is suitable for data scientists.
