Data Sources
Transparency about where our compound data comes from and how we process it.
PubChem
Primary Data Source
All compound data displayed on CompoundLookup is sourced from PubChem, the world's largest free chemistry database. PubChem is maintained by the National Center for Biotechnology Information (NCBI), which is part of the National Library of Medicine (NLM) at the National Institutes of Health (NIH).
PubChem Statistics:
- Over 115 million unique chemical structures
- Data from 850+ contributing organizations
- Updated daily with new compounds and data
- Freely accessible to everyone worldwide
How We Process Data
Data Extraction
We download compound data from PubChem's public FTP server. The data includes molecular formulas, IUPAC names, molecular weights, and structural information for millions of compounds.
Element Indexing
We parse each compound's molecular formula to identify which elements it contains, then build an index that maps each element combination to the compounds containing exactly those elements. This index is what powers our element-based search.
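The indexing step can be sketched in a few lines of Python. This is a minimal illustration, not CompoundLookup's actual code: the function names `elements_in` and `build_element_index` are hypothetical, the input is assumed to be simple `(CID, formula)` pairs, and formulas are assumed to use standard element symbols (one uppercase letter, optionally followed by one lowercase letter).

```python
import re
from collections import defaultdict

# An element symbol is one uppercase letter, optionally followed by one lowercase letter.
# Digits, charges (+/-), and subscripts are ignored because only the symbols matter here.
_ELEMENT_RE = re.compile(r"[A-Z][a-z]?")


def elements_in(formula):
    """Return the set of element symbols that appear in a molecular formula."""
    return frozenset(_ELEMENT_RE.findall(formula))


def build_element_index(compounds):
    """Map each element combination to the CIDs of compounds made of exactly those elements.

    `compounds` is an iterable of (cid, formula) pairs (an assumed input shape).
    """
    index = defaultdict(list)
    for cid, formula in compounds:
        index[elements_in(formula)].append(cid)
    return index


if __name__ == "__main__":
    # Small illustrative sample; a real run would stream millions of rows.
    sample = [(297, "CH4"), (241, "C6H6"), (702, "C2H6O"), (962, "H2O")]
    index = build_element_index(sample)
    print(index[frozenset({"C", "H"})])       # compounds made only of carbon and hydrogen
    print(index[frozenset({"C", "H", "O"})])  # compounds made only of C, H, and O
```

Using a `frozenset` as the key makes the combination order-independent, so "Carbon + Hydrogen" and "Hydrogen + Carbon" resolve to the same entry.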
Database Organization
Compounds are organized into 200,000+ unique element combinations. Each combination (like "Carbon + Hydrogen" or "Oxygen + Nitrogen + Sulfur") has its own searchable page listing all matching compounds.
Regular Updates
We periodically update our database with new compounds from PubChem, so our data stays reasonably current with additions to PubChem's database. Because updates are periodic, the newest PubChem entries may not appear here immediately.
What Data We Include
For each compound in our database, we display the following information:
Molecular Formula
The chemical formula showing elements and their quantities (e.g., H₂O, C₆H₁₂O₆)
IUPAC Name
The standardized chemical name according to IUPAC nomenclature
Molecular Weight
The molar mass of the compound, in grams per mole (g/mol)
PubChem CID
The unique identifier for the compound in PubChem's database
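To show how the molecular formula and molecular weight fields relate, the sketch below sums atomic weights over a formula. It is only an illustration: the values shown on CompoundLookup come from PubChem rather than from this calculation, the function name `molecular_weight` is hypothetical, the atomic-weight table is rounded and deliberately incomplete, and the parser assumes flat ASCII formulas such as `C6H12O6` with no parentheses or charges.

```python
import re

# Rounded standard atomic weights in g/mol; only a few elements are listed for illustration.
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}

# An element symbol followed by an optional count (a missing count means 1).
_TOKEN_RE = re.compile(r"([A-Z][a-z]?)(\d*)")


def molecular_weight(formula):
    """Sum atomic weights over a flat formula like 'H2O' or 'C6H12O6'.

    Raises KeyError for elements missing from ATOMIC_WEIGHTS.
    """
    total = 0.0
    for symbol, count in _TOKEN_RE.findall(formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


if __name__ == "__main__":
    print(round(molecular_weight("H2O"), 3))      # 2 x 1.008 + 15.999
    print(round(molecular_weight("C6H12O6"), 3))  # 6 x 12.011 + 12 x 1.008 + 6 x 15.999
```

For water this works out to 2 × 1.008 + 15.999 ≈ 18.015 g/mol. Small differences from PubChem's displayed value can arise from rounding of the atomic weights.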
How to Cite
If you use data obtained through CompoundLookup in academic work, please cite PubChem as the original data source:
You may also mention CompoundLookup as the tool used to access the data, but the primary citation should always be to PubChem.
Data Accuracy & Limitations
Important Notice
While we strive for accuracy, CompoundLookup is intended for educational and preliminary research purposes. For critical applications, always verify data directly against PubChem or other primary sources.
- Data may not be 100% complete or up-to-date at all times
- Some compound names may be simplified or abbreviated
- Always double-check data for professional or safety-critical use