Finding a Findable Dataset#

About this interactive recipe
  • Author(s): Stuart Chalk

  • Topic(s): How and where to find a ‘findable’ chemical dataset

  • Format(s): Interactive Jupyter Notebook (Python)

  • Scenario(s): You are looking for research data to complement, or compare with, your own data

  • Skill(s): You should be familiar with

  • Learning outcomes: After completing this example you should understand:

    • How to make a request to a website using the Python ‘requests’ functionality

    • How to retrieve data in JSON format and parse it (knowing the data model)

    • How to store confidential data in a remote file

    • How you can programmatically authenticate to an API (one of many ways)

  • Citation: ‘Finding a Findable Dataset’, Stuart Chalk, The IUPAC FAIR Chemistry Cookbook, Contributed: 2024-02-14 https://w3id.org/ifcc/IFCC013.

  • Reuse: This notebook is made available under a CC-BY-4.0 license.

Scenario#

Our group has a set of thermophysical data on over 8000 chemical substances. We want to integrate another physical property dataset into it so that we can analyze how the thermophysical data correlate with the chosen physical property for the substances that are common to both sets.

Criteria for picking the physical property dataset: high quality, trusted, large, and available with an open license, so that we can publish the results and make the derived dataset open.

  • High quality means: unambiguous identification of each chemical substance, enough contextual information (metadata) to make the values scientifically useful, i.e., at least the composition of the solvent, the temperature and for volatile substances the pressure.

  • Trusted means: the provenance chain is reported with the data, and it shows that the data comes from a reputable source(s) and any aggregation and/or processing is documented in enough detail that the community can understand how the dataset has been created/provided.
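The integration described in the scenario comes down to matching substances across the two datasets by a shared, unambiguous identifier. The block below is a minimal sketch of that matching step. It assumes each dataset has already been reduced to a Python dictionary keyed by the same kind of identifier (for example an InChIKey). The function name `common_substances`, the identifiers and the values are placeholders for illustration, not real data.

```python
def common_substances(thermo, other):
    """Return {identifier: (thermo_value, other_value)} for substances in both datasets."""
    # dict.keys() supports set operations, so '&' gives the shared identifiers
    shared = thermo.keys() & other.keys()
    return {key: (thermo[key], other[key]) for key in sorted(shared)}


# Placeholder identifiers and values, for illustration only
thermo = {"substance-A": 1.0, "substance-B": 2.0, "substance-C": 3.0}
other = {"substance-B": 20.0, "substance-C": 30.0, "substance-D": 40.0}

merged = common_substances(thermo, other)
print(merged)
```

Only the substances present in both dictionaries survive the merge, which is why unambiguous identification is listed first among the quality criteria.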

Step 1 - Searching PubChem for datasets#

PubChem houses a lot of data about chemical substances, compounds and bioassays. Over time, external organizations have worked with PubChem to include their data in one of two ways:

  • data that has been integrated into PubChem pages (e.g., CCDC -> example)

  • data that is not available in a PubChem page but is available via the data sources section of the site as ‘annotations’ (e.g. RCSB PDB -> Example)

The data available may not be structured or clearly described. However, if the source has a website with an API, you are likely to get better quality metadata from the linked site.

1.1 - Load the Python functions#

# 'json' is part of the Python standard library; 'requests' is a third-party
# package that may need to be installed first (e.g., pip install requests)
import requests
import json

1.2 - Search for sources that have ‘curation efforts’#

# This URL is the metadata about the data sources in PubChem
url = 'https://pubchem.ncbi.nlm.nih.gov/rest/pug/sourcetable/all/JSON/?response_type=display'
response = requests.get(url)
srcs = response.json()
results = []
search = 'Curation Efforts'  # i.e., a repository, or other type of data source (this is in index 8 of the data list for each source)
rows = srcs['Table']['Row']
for row in rows:
    if search in row['Cell'][8]:
        results.append({'name': row['Cell'][0], 'url': row['Cell'][9]})
# when printed this is a scrollable list of many entries
print(json.dumps(results, indent=4))
[
    {
        "name": "Agency for Toxic Substances and Disease Registry (ATSDR)",
        "url": "https://www.atsdr.cdc.gov/"
    },
    {
        "name": "Alliance of Genome Resources",
        "url": "https://www.alliancegenome.org/"
    },
    {
        "name": "Analytical Resources Core (ARC), Colorado State University (CSU)",
        "url": "https://www.research.colostate.edu/arc/"
    },
    {
        "name": "Athena Minerals",
        "url": "https://athena.unige.ch/athena/mineral/mineral.html"
    },
    {
        "name": "Barrie Walker, BARK Information Services",
        "url": "https://uk.linkedin.com/in/barrie-walker-85b4a510"
    },
    {
        "name": "BindingDB",
        "url": "https://www.bindingdb.org/rwd/bind/"
    },
    {
        "name": "BioCyc",
        "url": "https://biocyc.org/"
    },
    {
        "name": "BioGRID",
        "url": "https://thebiogrid.org/"
    },
    {
        "name": "BRENDA: Enzyme Functional Data",
        "url": "https://www.brenda-enzymes.org/"
    },
    {
        "name": "CAMEO Chemicals",
        "url": "https://cameochemicals.noaa.gov/"
    },
    {
        "name": "Catalogue of Life (COL)",
        "url": "https://www.catalogueoflife.org/"
    },
    {
        "name": "CCSbase",
        "url": "https://ccsbase.net/"
    },
    {
        "name": "Cell Line Ontology (CLO)",
        "url": ""
    },
    {
        "name": "Cell Ontology (CL)",
        "url": "https://obophenotype.github.io/cell-ontology/"
    },
    {
        "name": "ChEBI",
        "url": "https://www.ebi.ac.uk/chebi/"
    },
    {
        "name": "ChEMBL",
        "url": "https://www.ebi.ac.uk/chembl/"
    },
    {
        "name": "Chemical Probes Portal",
        "url": "https://www.chemicalprobes.org/"
    },
    {
        "name": "ChemIDplus",
        "url": "https://pubchem.ncbi.nlm.nih.gov/source/ChemIDplus"
    },
    {
        "name": "Chemoproteomic Metabolic Pathway Resource, Scripps University",
        "url": ""
    },
    {
        "name": "Chris Southan",
        "url": "https://www.ed.ac.uk/medicine-vet-medicine"
    },
    {
        "name": "Comparative Toxicogenomics Database (CTD)",
        "url": "http://ctdbase.org/"
    },
    {
        "name": "Cosmetic Ingredient Review (CIR)",
        "url": "https://cir-safety.org/"
    },
    {
        "name": "COVID-19 Disease Map",
        "url": "https://covid.pages.uni.lu/"
    },
    {
        "name": "Database of Interacting Proteins (DIP)",
        "url": "https://dip.doe-mbi.ucla.edu/"
    },
    {
        "name": "Drug Gene Interaction database (DGIdb)",
        "url": "https://dgidb.org/"
    },
    {
        "name": "Drug Induced Liver Injury Rank (DILIrank) Dataset",
        "url": "https://www.fda.gov/science-research/liver-toxicity-knowledge-base-ltkb/drug-induced-liver-injury-rank-dilirank-dataset"
    },
    {
        "name": "DrugBank",
        "url": "https://www.drugbank.ca/"
    },
    {
        "name": "DrugCentral",
        "url": "http://drugcentral.org"
    },
    {
        "name": "E. coli Metabolome Database (ECMDB)",
        "url": "https://ecmdb.ca/"
    },
    {
        "name": "Egon Willighagen, Department of Bioinformatics - BiGCaT, Maastricht University",
        "url": "https://www.maastrichtuniversity.nl/el-willighagen"
    },
    {
        "name": "Encyclopedia of Life (EOL)",
        "url": "https://eol.org/"
    },
    {
        "name": "EPA Air Toxics",
        "url": "https://www3.epa.gov/ttn/atw/index.html"
    },
    {
        "name": "EPA DSSTox",
        "url": "https://www.epa.gov/chemical-research/distributed-structure-searchable-toxicity-dsstox-database"
    },
    {
        "name": "FDA Global Substance Registration System (GSRS)",
        "url": "https://www.fda.gov/industry/fda-data-standards-advisory-board/fdas-global-substance-registration-system"
    },
    {
        "name": "FDA Orange Book",
        "url": "https://www.fda.gov/drugs/drug-approvals-and-databases/approved-drug-products-therapeutic-equivalence-evaluations-orange-book"
    },
    {
        "name": "FDA Pharm Classes",
        "url": "https://www.fda.gov/industry/structured-product-labeling-resources/pharmacologic-class"
    },
    {
        "name": "FlyBase",
        "url": "https://flybase.org/"
    },
    {
        "name": "Gene Curation Coalition (GenCC)",
        "url": "https://thegencc.org/"
    },
    {
        "name": "Gene Ontology (GO)",
        "url": "http://geneontology.org/"
    },
    {
        "name": "Genomics of Drug Sensitivity in Cancer (GDSC)",
        "url": "https://www.cancerrxgene.org/"
    },
    {
        "name": "Gloriam Group: Pharmaceutical informatics - Department of Drug Design and Pharmacology, University of Copenhagen",
        "url": ""
    },
    {
        "name": "Glycan Naming and Subsumption Ontology (GNOme)",
        "url": "https://gnome.glyomics.org/"
    },
    {
        "name": "GlycoNAVI",
        "url": "https://glyconavi.org/"
    },
    {
        "name": "GlyConnect",
        "url": "https://glyconnect.expasy.org/"
    },
    {
        "name": "GlyGen",
        "url": "https://www.glygen.org/"
    },
    {
        "name": "Handbook of Mineralogy",
        "url": "https://handbookofmineralogy.org/"
    },
    {
        "name": "Haz-Map, Information on Hazardous Chemicals and Occupational Diseases",
        "url": "https://haz-map.com/"
    },
    {
        "name": "Hazardous Chemical Information System (HCIS), Safe Work Australia",
        "url": "http://hcis.safeworkaustralia.gov.au/"
    },
    {
        "name": "Hazardous Substances Data Bank (HSDB)",
        "url": "https://www.nlm.nih.gov/toxnet/index.html"
    },
    {
        "name": "HUGO Gene Nomenclature Committee (HGNC)",
        "url": "https://www.genenames.org/"
    },
    {
        "name": "Human Metabolome Database (HMDB)",
        "url": "https://hmdb.ca/"
    },
    {
        "name": "Human Protein Atlas (HPA)",
        "url": "https://www.proteinatlas.org/"
    },
    {
        "name": "ILO-WHO International Chemical Safety Cards (ICSCs)",
        "url": "https://chemicalsafety.ilo.org/dyn/icsc/showcard.home"
    },
    {
        "name": "INOH",
        "url": ""
    },
    {
        "name": "IntAct Molecular Interaction Database",
        "url": "https://www.ebi.ac.uk/intact/"
    },
    {
        "name": "InterPro",
        "url": "https://www.ebi.ac.uk/interpro/"
    },
    {
        "name": "IUPAC Digitized pKa Dataset",
        "url": "https://github.com/IUPAC/Dissociation-Constants"
    },
    {
        "name": "IUPAC Periodic Table of the Elements and Isotopes (IPTEI)",
        "url": "https://iupac.org/iptei/"
    },
    {
        "name": "IUPHAR/BPS Guide to PHARMACOLOGY",
        "url": "https://www.guidetopharmacology.org/"
    },
    {
        "name": "KEGG",
        "url": "https://www.genome.jp/kegg/kegg2.html"
    },
    {
        "name": "KNApSAcK Species-Metabolite Database",
        "url": "http://www.knapsackfamily.com/KNApSAcK/"
    },
    {
        "name": "Kruve Lab, Ionization & Mass Spectrometry, Stockholm University",
        "url": "https://kruvelab.com/"
    },
    {
        "name": "Lab and Research Safety, University of Minnesota",
        "url": ""
    },
    {
        "name": "LIPID MAPS",
        "url": "https://lipidmaps.org/"
    },
    {
        "name": "LOTUS - the natural products occurrence database",
        "url": "https://lotus.naturalproducts.net/"
    },
    {
        "name": "MarkerDB",
        "url": "https://markerdb.ca/"
    },
    {
        "name": "Medical Subject Headings (MeSH)",
        "url": "https://www.ncbi.nlm.nih.gov/mesh"
    },
    {
        "name": "Molecular Imaging and Contrast Agent Database (MICAD)",
        "url": "https://www.ncbi.nlm.nih.gov/books/NBK5330/"
    },
    {
        "name": "Molecular Imaging Database (MOLI)",
        "url": ""
    },
    {
        "name": "Mouse Genome Informatics (MGI)",
        "url": "https://www.informatics.jax.org/"
    },
    {
        "name": "Natural Product Activity and Species Source (NPASS)",
        "url": "https://bidd.group/NPASS/"
    },
    {
        "name": "NCI Thesaurus (NCIt)",
        "url": "https://ncit.nci.nih.gov/ncitbrowser/"
    },
    {
        "name": "NEQUIM - Chemoinformatics Group",
        "url": "http://nequim.qui.ufmg.br/"
    },
    {
        "name": "NextMove Software",
        "url": "https://www.nextmovesoftware.com/"
    },
    {
        "name": "NIAID ChemDB",
        "url": "https://chemdb.niaid.nih.gov/"
    },
    {
        "name": "NIST Chemistry WebBook",
        "url": ""
    },
    {
        "name": "NITE-CMC",
        "url": "https://www.nite.go.jp/chem/english/ghs/ghs_index.html"
    },
    {
        "name": "Online Mendelian Inheritance in Man (OMIM)",
        "url": "https://omim.org/"
    },
    {
        "name": "Open Targets",
        "url": "https://www.opentargets.org/"
    },
    {
        "name": "PANTHER",
        "url": "http://www.pantherdb.org/"
    },
    {
        "name": "PathBank",
        "url": "https://pathbank.org/"
    },
    {
        "name": "Pathway Interaction Database",
        "url": ""
    },
    {
        "name": "Pfam",
        "url": "https://www.ebi.ac.uk/interpro/"
    },
    {
        "name": "PharmGKB",
        "url": "https://www.pharmgkb.org/"
    },
    {
        "name": "Pharos",
        "url": "https://pharos.nih.gov/"
    },
    {
        "name": "Pistoia Alliance DataFAIRy Bioassay Pilot",
        "url": "https://www.pistoiaalliance.org/projects/current-projects/datafairy-project/"
    },
    {
        "name": "Plant Reactome",
        "url": "https://plantreactome.gramene.org/index.php?lang=en"
    },
    {
        "name": "PlantCyc",
        "url": "https://www.plantcyc.org/"
    },
    {
        "name": "PomBase: Fission Yeast Resource",
        "url": "https://www.pombase.org/"
    },
    {
        "name": "Protein Ontology",
        "url": "https://proconsortium.org/"
    },
    {
        "name": "Rat Genome Database (RGD)",
        "url": "https://rgd.mcw.edu/"
    },
    {
        "name": "RCSB Protein Data Bank (RCSB PDB)",
        "url": "https://www.rcsb.org/"
    },
    {
        "name": "Reactome",
        "url": "https://reactome.org/"
    },
    {
        "name": "Rhea - Annotated Reactions Database",
        "url": "https://www.rhea-db.org/"
    },
    {
        "name": "Risk Assessment Information System (RAIS)",
        "url": "https://rais.ornl.gov/"
    },
    {
        "name": "RRUFF Project",
        "url": "https://rruff.info/"
    },
    {
        "name": "Saccharomyces Genome Database (SGD)",
        "url": "https://www.yeastgenome.org/"
    },
    {
        "name": "STRING: functional protein association networks",
        "url": "https://string-db.org/"
    },
    {
        "name": "SureChEMBL",
        "url": "https://www.surechembl.org/"
    },
    {
        "name": "Swiss Institute of Bioinformatics Bgee",
        "url": "https://www.bgee.org/"
    },
    {
        "name": "Swiss Institute of Bioinformatics Cellosaurus",
        "url": "https://www.cellosaurus.org/"
    },
    {
        "name": "Swiss Institute of Bioinformatics ENZYME",
        "url": "https://enzyme.expasy.org/"
    },
    {
        "name": "Swiss Institute of Bioinformatics neXtProt",
        "url": "https://www.nextprot.org/"
    },
    {
        "name": "Symbol Nomenclature for Glycans (SNFG) Reference Collection",
        "url": "https://www.ncbi.nlm.nih.gov/glycans/snfg.html"
    },
    {
        "name": "The Cambridge Structural Database",
        "url": ""
    },
    {
        "name": "The National Institute for Occupational Safety and Health (NIOSH)",
        "url": "https://www.cdc.gov/niosh/npg/"
    },
    {
        "name": "The Natural Products Atlas",
        "url": "https://www.npatlas.org/"
    },
    {
        "name": "The University of Alabama Libraries",
        "url": "https://ir.ua.edu/"
    },
    {
        "name": "The Zebrafish Information Network (ZFIN)",
        "url": "https://zfin.org/"
    },
    {
        "name": "Therapeutic Target Database (TTD)",
        "url": "https://db.idrblab.net/ttd/"
    },
    {
        "name": "Toxin and Toxin Target Database (T3DB)",
        "url": "http://www.t3db.ca/"
    },
    {
        "name": "UniCarbKB",
        "url": ""
    },
    {
        "name": "UniProt",
        "url": "https://www.uniprot.org/"
    },
    {
        "name": "UVCBs and Multi-Constituent Substances (MCS) Committee",
        "url": "https://hesiglobal.org/uvcb/"
    },
    {
        "name": "VEuPathDB: The Eukaryotic Pathogen, Vector and Host Informatics Resource",
        "url": "https://veupathdb.org/"
    },
    {
        "name": "Wikidata",
        "url": "https://www.wikidata.org/wiki/Wikidata:Main_Page"
    },
    {
        "name": "Wikipedia",
        "url": "https://en.wikipedia.org/wiki/Main_Page"
    },
    {
        "name": "World Register of Marine Species (WoRMS)",
        "url": "https://www.marinespecies.org/"
    },
    {
        "name": "WormBase",
        "url": "https://wormbase.org/"
    },
    {
        "name": "Xenbase",
        "url": "https://www.xenbase.org/"
    },
    {
        "name": "Yeast Metabolome Database (YMDB)",
        "url": "http://www.ymdb.ca/"
    }
]
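Several of the sources listed above have an empty "url", so they cannot be followed up directly. The block below is a sketch of how the filtering step could be wrapped in a reusable function that can also skip those entries. It assumes the same `Table` → `Row` → `Cell` layout and the same cell positions (0 for the name, 8 for the category, 9 for the URL) that the code above relies on. The function name `find_sources`, the helper `make_row`, and the "Example Vendor" row are hypothetical and exist only so the sketch can run without a network request.

```python
def find_sources(table, term, category_idx=8, name_idx=0, url_idx=9, require_url=False):
    """Return name/url dicts for rows whose category cell contains `term`."""
    hits = []
    for row in table["Table"]["Row"]:
        cells = row["Cell"]
        if term not in cells[category_idx]:
            continue
        if require_url and not cells[url_idx]:
            continue  # skip sources without a website to follow up on
        hits.append({"name": cells[name_idx], "url": cells[url_idx]})
    return hits


def make_row(name, category, url):
    """Build a placeholder row with ten cells (hypothetical helper for the demo)."""
    cells = [""] * 10
    cells[0], cells[8], cells[9] = name, category, url
    return {"Cell": cells}


# Hand-built stand-in for the JSON returned by PubChem (illustration only)
sample = {"Table": {"Row": [
    make_row("ChEBI", "Curation Efforts", "https://www.ebi.ac.uk/chebi/"),
    make_row("NIST Chemistry WebBook", "Curation Efforts", ""),
    make_row("Example Vendor", "Chemical Vendors", "https://example.org/"),
]}}

print(find_sources(sample, "Curation Efforts"))
print(find_sources(sample, "Curation Efforts", require_url=True))
```

With the live data, `find_sources(srcs, 'Curation Efforts', require_url=True)` would return the same kind of list as above, minus the entries that have no URL.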

Step 2 - Searching FAIRsharing for datasets#

FAIRsharing is a registry of FAIR resources (standards, databases and policies) and is not, per se, a database of datasets. However, you might find a repository here that has the kind of data you are looking for. The code below accesses the FAIRsharing API to search for resources related to ‘chemistry’ (or another term).

Note: To use the code below please go to https://fairsharing.org/accounts/signup, create an account and then enter your username and password in the quotes for ‘fs_user’ and ‘fs_pass’ below.
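Typing a password directly into a notebook makes it easy to leak when the notebook is shared. One common alternative is to keep the credentials outside the notebook and read them in at run time. The block below is a minimal sketch that reads them from environment variables. The variable names `FAIRSHARING_USER` and `FAIRSHARING_PASS` and the function `load_credentials` are assumptions made for this example and are not defined by FAIRsharing.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables."""
    try:
        return env["FAIRSHARING_USER"], env["FAIRSHARING_PASS"]
    except KeyError as missing:
        # report which variable is absent without printing any secret
        raise RuntimeError(
            "Set the environment variable {0} before running".format(missing.args[0])
        ) from None


# Demonstration with a stand-in mapping instead of the real environment
demo_env = {"FAIRSHARING_USER": "my_username", "FAIRSHARING_PASS": "my_password"}
fs_user, fs_pass = load_credentials(demo_env)

# same login structure as the authentication request in this recipe
login = {'user': {'login': fs_user, 'password': fs_pass}}
print(login['user']['login'])
```

In real use, calling `load_credentials()` with no argument reads from the environment of the process running the notebook, so the password never appears in the saved file.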

2.1 - Authentication to the FAIRsharing API#

# see https://fairsharing.org/API_doc for instructions on how to search the API
# user login
fs_user = "ChemCookbook"
fs_pass = "ydt_wdh_MRD*qut5xvq"
url = 'https://api.fairsharing.org/users/sign_in'
loghdrs = {'Accept': 'application/json','Content-Type': 'application/json'}
login = {'user': {'login': fs_user, 'password': fs_pass}}
response = requests.request("POST", url, headers=loghdrs, data=json.dumps(login))
data = response.json()
print(data)
{'success': True, 'jwt': 'eyJhbGciOiJIUzI1NiJ9.eyJqdGkiOiI0YjVkNzQyMi1hOTg3LTRlZWYtYjQyYi1hN2U3Yjc2MTM0ZTEiLCJzdWIiOiI4NTA5Iiwic2NwIjoidXNlciIsImF1ZCI6bnVsbCwiaWF0IjoxNzM1NTcwNjQ4LCJleHAiOjE3MzU2NTcwNDh9.TTEXXG6zzV-r543qQvaxUNBpQo_tZ8vE4VH7oPTgQDo', 'username': 'ChemCookbook', 'id': 8509, 'role': 'user', 'profile_type': 'none', 'watched_records': [], 'is_curator': False, 'is_super_curator': False, 'third_party': False, 'expiry': 1735657048, 'message': 'Authentication successful'}
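The login response includes an 'expiry' value alongside the token, so the token cannot be reused indefinitely. The block below is a sketch of how to check whether a stored token is still usable before making further requests. It assumes that 'expiry' is a Unix timestamp in seconds (UTC), which is consistent with the value shown above but is not confirmed here by the API documentation. The function names are hypothetical.

```python
from datetime import datetime, timezone


def token_expiry(login_response):
    """Convert the 'expiry' field (assumed Unix seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(login_response["expiry"], tz=timezone.utc)


def token_is_valid(login_response, now=None):
    """True if the login succeeded and the token has not yet expired."""
    now = now or datetime.now(timezone.utc)
    return bool(login_response.get("success")) and now < token_expiry(login_response)


# Only the two fields needed here, copied from the response printed above
example = {"success": True, "expiry": 1735657048}
print(token_expiry(example).isoformat())  # 2024-12-31T14:57:28+00:00
print(token_is_valid(example))
```

If `token_is_valid(data)` returns False, repeating the login request gives a fresh token.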

2.2 - Make the API request#

# in order to authenticate when making an API request, the 'jwt' token above must
# be included in the HTTP request headers (see https://en.wikipedia.org/wiki/List_of_HTTP_header_fields)
jwt = data['jwt']
srchdrs = {'Accept': 'application/json', 'Content-Type': 'application/json', 'Authorization': "Bearer {0}".format(jwt)}
searchterm = 'chemistry'
searchurl = 'https://api.fairsharing.org/search/fairsharing_records?q=' + searchterm
search = requests.request("POST", searchurl, headers=srchdrs)
hits = search.json()
# this prints out the raw JSON for the first entry (the 'data' entry is a JSON list)
# that is returned from the API request (formatted nicely, which means it is on many lines)
print(json.dumps(hits['data'][0], indent=4))
{
    "id": "3524",
    "type": "fairsharing_records",
    "attributes": {
        "created_at": "2018-07-11T19:43:37.000Z",
        "updated_at": "2023-07-13T09:45:16.947Z",
        "metadata": {
            "name": "Chemistry",
            "status": "ready",
            "contacts": [],
            "homepage": "https://www.go-fair.org/implementation-networks/overview/chemistryin/",
            "citations": [],
            "identifier": 3524,
            "description": "A curated collection of standards, databases and policies covering all aspects of chemistry, providing a curated view of FAIRsharing chemistry resources. Although this collection was initiated within the GO FAIR Implementation Network for Chemistry (and in tandem with the Chemistry Research Data Interest Group (CRDIG) of the Research Data Alliance), it has grown beyond this project. Any domain experts who would like to contribute to this collection, within the RDA or beyond, are welcome to claim and help maintain it.",
            "abbreviation": "Chemistry",
            "reference_url": "https://www.go-fair.org/implementation-networks/overview/implementation-networks-archive/chemistryin/"
        },
        "legacy_ids": [
            "bsg-c000049"
        ],
        "name": "FAIRsharing record for: Chemistry",
        "abbreviation": "Chemistry",
        "url": "https://fairsharing.org/fairsharing_records/3524",
        "doi": null,
        "fairsharing_licence": "https://creativecommons.org/licenses/by-sa/4.0/. Please link to https://fairsharing.org and https://api.fairsharing.org/img/fairsharing-attribution.svg for attribution.",
        "description": "This FAIRsharing record describes: A curated collection of standards, databases and policies covering all aspects of chemistry, providing a curated view of FAIRsharing chemistry resources. Although this collection was initiated within the GO FAIR Implementation Network for Chemistry (and in tandem with the Chemistry Research Data Interest Group (CRDIG) of the Research Data Alliance), it has grown beyond this project. Any domain experts who would like to contribute to this collection, within the RDA or beyond, are welcome to claim and help maintain it.",
        "linked_records": [
            {
                "linked_record_name": "QSAR DataBase",
                "linked_record_id": 4904,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 17259
            },
            {
                "linked_record_name": "Nuclear Magnetic Resonance Controlled Vocabulary",
                "linked_record_id": 1021,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11706
            },
            {
                "linked_record_name": "Chemical Entities of Biological Interest",
                "linked_record_id": 1036,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11707
            },
            {
                "linked_record_name": "CHEMical INFormation Ontology",
                "linked_record_id": 1056,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11708
            },
            {
                "linked_record_name": "Suggested Ontology for PHARMacogenomics",
                "linked_record_id": 55,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11709
            },
            {
                "linked_record_name": "Standards for Reporting Enzymology Data Guidelines",
                "linked_record_id": 1292,
                "linked_record_registry": "Standard",
                "linked_record_type": "reporting_guideline",
                "relation": "collects",
                "link_id": 11710
            },
            {
                "linked_record_name": "CDISC Laboratory Data Model",
                "linked_record_id": 1507,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11711
            },
            {
                "linked_record_name": "Core Information for Metabolomics Reporting",
                "linked_record_id": 665,
                "linked_record_registry": "Standard",
                "linked_record_type": "reporting_guideline",
                "relation": "collects",
                "link_id": 11712
            },
            {
                "linked_record_name": "Chemical Markup Language",
                "linked_record_id": 91,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11713
            },
            {
                "linked_record_name": "MDL Molfile Format",
                "linked_record_id": 1480,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11714
            },
            {
                "linked_record_name": "Minimal Information Required In the Annotation of Models",
                "linked_record_id": 988,
                "linked_record_registry": "Standard",
                "linked_record_type": "reporting_guideline",
                "relation": "collects",
                "link_id": 11715
            },
            {
                "linked_record_name": "CHARMM Card File Format",
                "linked_record_id": 118,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11716
            },
            {
                "linked_record_name": "Minimum Information about a Molecular Interaction Experiment",
                "linked_record_id": 1279,
                "linked_record_registry": "Standard",
                "linked_record_type": "reporting_guideline",
                "relation": "collects",
                "link_id": 11717
            },
            {
                "linked_record_name": "Crystallographic Information Framework",
                "linked_record_id": 183,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11718
            },
            {
                "linked_record_name": "Core Scientific MetaData model",
                "linked_record_id": 391,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11719
            },
            {
                "linked_record_name": "National Drug File",
                "linked_record_id": 155,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11720
            },
            {
                "linked_record_name": "National Drug Data File",
                "linked_record_id": 1073,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11721
            },
            {
                "linked_record_name": "Pharmacogenomic Relationships Ontology",
                "linked_record_id": 215,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11722
            },
            {
                "linked_record_name": "Cancer Chemoprevention Ontology",
                "linked_record_id": 898,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11723
            },
            {
                "linked_record_name": "Master Drug Data Base Clinical Drugs",
                "linked_record_id": 680,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11724
            },
            {
                "linked_record_name": "Toxicology Data Markup Language",
                "linked_record_id": 374,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11725
            },
            {
                "linked_record_name": "Minimum Information Required for a Drug Metabolism Enzymes and Transporters Experiment",
                "linked_record_id": 1192,
                "linked_record_registry": "Standard",
                "linked_record_type": "reporting_guideline",
                "relation": "collects",
                "link_id": 11726
            },
            {
                "linked_record_name": "Analytical Information Markup Language",
                "linked_record_id": 1172,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11727
            },
            {
                "linked_record_name": "Joint Committee on Atomic and Molecular Physical data - working group on Data eXchange",
                "linked_record_id": 111,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11728
            },
            {
                "linked_record_name": "Nuclear Magnetic Resonance Markup Language",
                "linked_record_id": 1363,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11729
            },
            {
                "linked_record_name": "Pharmacometrics Markup Language",
                "linked_record_id": 16,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11730
            },
            {
                "linked_record_name": "IUPAC International Chemical Identifier",
                "linked_record_id": 1477,
                "linked_record_registry": "Standard",
                "linked_record_type": "identifier_schema",
                "relation": "collects",
                "link_id": 11731
            },
            {
                "linked_record_name": "eNanoMapper Ontology",
                "linked_record_id": 493,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11732
            },
            {
                "linked_record_name": "ChemDraw Native File Format",
                "linked_record_id": 1343,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11733
            },
            {
                "linked_record_name": "Minimum Information About a Simulation Experiment",
                "linked_record_id": 1260,
                "linked_record_registry": "Standard",
                "linked_record_type": "reporting_guideline",
                "relation": "collects",
                "link_id": 11734
            },
            {
                "linked_record_name": "Structure Data Format",
                "linked_record_id": 1462,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11735
            },
            {
                "linked_record_name": "Drug Target Ontology",
                "linked_record_id": 1469,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11736
            },
            {
                "linked_record_name": "ThermoML",
                "linked_record_id": 1091,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11737
            },
            {
                "linked_record_name": "AnaEE Thesaurus",
                "linked_record_id": 1545,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11738
            },
            {
                "linked_record_name": "Hierarchical Editing Language for Macromolecules",
                "linked_record_id": 1108,
                "linked_record_registry": "Standard",
                "linked_record_type": "model_and_format",
                "relation": "collects",
                "link_id": 11739
            },
            {
                "linked_record_name": "CAS Registry Number",
                "linked_record_id": 381,
                "linked_record_registry": "Standard",
                "linked_record_type": "identifier_schema",
                "relation": "collects",
                "link_id": 11740
            },
            {
                "linked_record_name": "Chemistry vocabulary",
                "linked_record_id": 101,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11741
            },
            {
                "linked_record_name": "IUPAC Compendium of Chemical Terminology",
                "linked_record_id": 46,
                "linked_record_registry": "Standard",
                "linked_record_type": "terminology_artefact",
                "relation": "collects",
                "link_id": 11742
            },
            {
                "linked_record_name": "Springer Nature Research Data Policy Type 1 (legacy)",
                "linked_record_id": 3360,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal_publisher",
                "relation": "collects",
                "link_id": 11743
            },
            {
                "linked_record_name": "Springer Nature Research Data Policy Type 4 (legacy)",
                "linked_record_id": 3415,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal_publisher",
                "relation": "collects",
                "link_id": 11744
            },
            {
                "linked_record_name": "American Chemical Society - Accounts of Chemical Research  - Information for Authors",
                "linked_record_id": 3400,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11745
            },
            {
                "linked_record_name": "American Chemical Society - Chemical Reviews - Author Guidelines",
                "linked_record_id": 3472,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11746
            },
            {
                "linked_record_name": "Springer Nature Research Data Policy Type 2 (legacy)",
                "linked_record_id": 3425,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal_publisher",
                "relation": "collects",
                "link_id": 11747
            },
            {
                "linked_record_name": "Hindawi Research Data Policy",
                "linked_record_id": 3412,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11748
            },
            {
                "linked_record_name": "Nature - Nature Chemical Biology - Reporting standards and availability of data, materials, code and protocols",
                "linked_record_id": 3486,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11749
            },
            {
                "linked_record_name": "Springer Nature Research Data Policy Type 3 (for life sciences) (legacy)",
                "linked_record_id": 3416,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal_publisher",
                "relation": "collects",
                "link_id": 11750
            },
            {
                "linked_record_name": "Royal Society of Chemistry - Data policy",
                "linked_record_id": 3410,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11751
            },
            {
                "linked_record_name": "Nature - Nature Chemistry - Reporting standards and availability of data, materials, code and protocols",
                "linked_record_id": 3371,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11752
            },
            {
                "linked_record_name": "Springer Nature Research Data Policy Type 3 (legacy)",
                "linked_record_id": 3426,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal_publisher",
                "relation": "collects",
                "link_id": 11753
            },
            {
                "linked_record_name": "American Society of Plant Biologists - The Plant Cell - Author Guidelines",
                "linked_record_id": 3446,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11754
            },
            {
                "linked_record_name": "Cell Press - Trends In Biochemical Sciences - Instructions for Authors",
                "linked_record_id": 3361,
                "linked_record_registry": "Policy",
                "linked_record_type": "journal",
                "relation": "collects",
                "link_id": 11755
            },
            {
                "linked_record_name": "GlycoNAVI",
                "linked_record_id": 1548,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11756
            },
            {
                "linked_record_name": "canSAR",
                "linked_record_id": 1559,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11757
            },
            {
                "linked_record_name": "Crystallography Open Database",
                "linked_record_id": 1562,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11758
            },
            {
                "linked_record_name": "MINAS - A Database of Metal Ions in Nucleic AcidS",
                "linked_record_id": 1607,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11759
            },
            {
                "linked_record_name": "Polbase",
                "linked_record_id": 1632,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase_and_repository",
                "relation": "collects",
                "link_id": 11760
            },
            {
                "linked_record_name": "Protein-Chemical Structural Interactions",
                "linked_record_id": 1636,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11761
            },
            {
                "linked_record_name": "Rhea",
                "linked_record_id": 1639,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11762
            },
            {
                "linked_record_name": "Syntheses, Chemicals, and Reactions In Patents DataBase",
                "linked_record_id": 1642,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11763
            },
            {
                "linked_record_name": "Search Tool for Interactions of Chemicals",
                "linked_record_id": 1648,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase_and_repository",
                "relation": "collects",
                "link_id": 11764
            },
            {
                "linked_record_name": "SuperTarget",
                "linked_record_id": 1691,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11765
            },
            {
                "linked_record_name": "TDR Targets",
                "linked_record_id": 1692,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11766
            },
            {
                "linked_record_name": "Comparative Toxicogenomics Database",
                "linked_record_id": 1716,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11767
            },
            {
                "linked_record_name": "Chemical Abstracts Service Registry",
                "linked_record_id": 1747,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11768
            },
            {
                "linked_record_name": "The Cambridge Structural Database",
                "linked_record_id": 1796,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11769
            },
            {
                "linked_record_name": "Chemical Component Dictionary",
                "linked_record_id": 1847,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11770
            },
            {
                "linked_record_name": "Small Molecule Pathway Database",
                "linked_record_id": 1887,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase_and_repository",
                "relation": "collects",
                "link_id": 11771
            },
            {
                "linked_record_name": "SABIO-RK Biochemical Reaction Kinetics Database",
                "linked_record_id": 1891,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11772
            },
            {
                "linked_record_name": "J-GLOBAL",
                "linked_record_id": 1936,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11773
            },
            {
                "linked_record_name": "Three-Dimensional Structure Database of Natural Metabolites",
                "linked_record_id": 1999,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11774
            },
            {
                "linked_record_name": "ChemIDplus",
                "linked_record_id": 2003,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11775
            },
            {
                "linked_record_name": "ChemSpider",
                "linked_record_id": 2042,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11776
            },
            {
                "linked_record_name": "The Human Metabolome Database",
                "linked_record_id": 2084,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11777
            },
            {
                "linked_record_name": "The UC Irvine ChemDB",
                "linked_record_id": 2088,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11778
            },
            {
                "linked_record_name": "eCrystals",
                "linked_record_id": 2108,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11779
            },
            {
                "linked_record_name": "SigMol",
                "linked_record_id": 2241,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase_and_repository",
                "relation": "collects",
                "link_id": 11780
            },
            {
                "linked_record_name": "MetaNetX",
                "linked_record_id": 2281,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11781
            },
            {
                "linked_record_name": "data.eNanoMapper.net",
                "linked_record_id": 2282,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11782
            },
            {
                "linked_record_name": "Wikidata",
                "linked_record_id": 2329,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11783
            },
            {
                "linked_record_name": "Open Targets",
                "linked_record_id": 2369,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11784
            },
            {
                "linked_record_name": "SureChEMBL",
                "linked_record_id": 2378,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11785
            },
            {
                "linked_record_name": "EarthChem",
                "linked_record_id": 2424,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11786
            },
            {
                "linked_record_name": "EPA Comptox Chemicals Dashboard",
                "linked_record_id": 2480,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11787
            },
            {
                "linked_record_name": "High Energy Physics Data Repository",
                "linked_record_id": 2497,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11788
            },
            {
                "linked_record_name": "CSIRO Data Access Portal",
                "linked_record_id": 2508,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11789
            },
            {
                "linked_record_name": "FlavorDB",
                "linked_record_id": 2516,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11790
            },
            {
                "linked_record_name": "Target Pathogen",
                "linked_record_id": 2522,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11791
            },
            {
                "linked_record_name": "CloudFlame",
                "linked_record_id": 2531,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11792
            },
            {
                "linked_record_name": "4TU.ResearchData",
                "linked_record_id": 2537,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11793
            },
            {
                "linked_record_name": "Citrination",
                "linked_record_id": 2538,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11794
            },
            {
                "linked_record_name": "ThermoML Archive",
                "linked_record_id": 2545,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11795
            },
            {
                "linked_record_name": "SuperDRUG2 - A One Stop Resource for Approved/Marketed Drugs",
                "linked_record_id": 2590,
                "linked_record_registry": "Database",
                "linked_record_type": "knowledgebase",
                "relation": "collects",
                "link_id": 11796
            },
            {
                "linked_record_name": "NASA Ames PAH IR Spectroscopic Database",
                "linked_record_id": 2619,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11797
            },
            {
                "linked_record_name": "NIST Atomic Spectra Database",
                "linked_record_id": 2621,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11798
            },
            {
                "linked_record_name": "Kinetic Database for Astrochemistry",
                "linked_record_id": 2622,
                "linked_record_registry": "Database",
                "linked_record_type": "repository",
                "relation": "collects",
                "link_id": 11799
            }
        ],
        "linking_records": [],
        "fairsharing_registry": "Collection",
        "record_type": "collection",
        "subjects": [
            "Chemistry"
        ],
        "domains": [],
        "taxonomies": [
            "All"
        ],
        "user_defined_tags": [],
        "countries": [
            "Worldwide"
        ],
        "publications": [],
        "licence_links": [],
        "grants": [
            {
                "id": 10157,
                "fairsharing_record_id": 3524,
                "organisation_id": 3794,
                "relation": "associated_with",
                "created_at": "2022-11-28T13:52:29.395Z",
                "updated_at": "2022-11-28T13:52:29.395Z",
                "grant_id": null,
                "is_lead": false,
                "saved_state": {
                    "id": 3794,
                    "name": "RDA (Research Data Alliance) Chemistry Research Data Interest Group (CRDIG)",
                    "types": [
                        "Consortium"
                    ],
                    "is_lead": false,
                    "relation": "associated_with"
                }
            },
            {
                "id": 10158,
                "fairsharing_record_id": 3524,
                "organisation_id": 1187,
                "relation": "associated_with",
                "created_at": "2022-11-28T13:52:29.442Z",
                "updated_at": "2022-11-28T13:52:29.442Z",
                "grant_id": null,
                "is_lead": false,
                "saved_state": {
                    "id": 1187,
                    "name": "GO FAIR",
                    "types": [
                        "Consortium"
                    ],
                    "is_lead": false,
                    "relation": "associated_with"
                }
            }
        ],
        "url_for_logo": null,
        "exhaustive_licences": false
    }
}
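
The record printed above tags each linked entry with a registry (Standard, Policy or Database) and a type. For the scenario in this recipe only the Database entries are of interest, so it can help to filter them out of the long list. The following is a minimal sketch of one way to do that. It assumes the list is stored under a key called `linked_records`, which is not visible in this part of the output, and it uses a handful of entries copied from above as stand-in data so that it runs on its own. The function name `linked_databases` is made up for illustration.

```python
# Stand-in for the 'attributes' of the record printed above; only a few of its
# linked records are copied here so that the sketch is self-contained.
# ASSUMPTION: the list is stored under the key 'linked_records'.
attributes = {
    "linked_records": [
        {"linked_record_name": "ThermoML", "linked_record_id": 1091,
         "linked_record_registry": "Standard", "linked_record_type": "model_and_format"},
        {"linked_record_name": "Royal Society of Chemistry - Data policy", "linked_record_id": 3410,
         "linked_record_registry": "Policy", "linked_record_type": "journal"},
        {"linked_record_name": "ChemSpider", "linked_record_id": 2042,
         "linked_record_registry": "Database", "linked_record_type": "knowledgebase"},
        {"linked_record_name": "ThermoML Archive", "linked_record_id": 2545,
         "linked_record_registry": "Database", "linked_record_type": "repository"},
        {"linked_record_name": "Crystallography Open Database", "linked_record_id": 1562,
         "linked_record_registry": "Database", "linked_record_type": "repository"},
    ]
}


def linked_databases(attrs, record_type=None):
    """Return (name, id) pairs for linked records in the Database registry.

    If record_type is given (e.g. 'repository'), keep only records of that type.
    (Hypothetical helper, not part of any FAIRsharing client library.)
    """
    found = []
    for rec in attrs.get("linked_records", []):
        if rec.get("linked_record_registry") != "Database":
            continue
        if record_type is not None and rec.get("linked_record_type") != record_type:
            continue
        found.append((rec["linked_record_name"], rec["linked_record_id"]))
    return found


# print only the linked repositories, one per line
for name, rec_id in linked_databases(attributes, record_type="repository"):
    print(f"{name} (linked_record_id {rec_id})")
```

With the real record, `attributes` would be taken from the JSON returned by the API rather than typed in by hand, and the key name should be checked against the full output first.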

2.3 - Output the data in a presentable format#

# loop over the records returned by the search and print each record's name and URL, one per line
for hit in hits['data']:
    print(hit['attributes']['name'] + ": " + hit['attributes']['url'])
FAIRsharing record for: Chemistry: https://fairsharing.org/fairsharing_records/3524
FAIRsharing record for: Royal Society of Chemistry - Data policy: https://fairsharing.org/10.25504/FAIRsharing.egbgwm
FAIRsharing record for: Portable reduced-precision binary format for trajectories produced by GROMACS package.: https://fairsharing.org/10.25504/FAIRsharing.cb1adb
FAIRsharing record for: Chemistry vocabulary: https://fairsharing.org/10.25504/FAIRsharing.TrcBD2
FAIRsharing record for: EMODnet Chemistry: https://fairsharing.org/10.25504/FAIRsharing.KOiDmy
FAIRsharing record for: BindingDB database of measured binding affinities: https://fairsharing.org/10.25504/FAIRsharing.3b36hk
FAIRsharing record for: ioChem-BD: https://fairsharing.org/10.25504/FAIRsharing.lwW6a1
FAIRsharing record for: MINAS - A Database of Metal Ions in Nucleic AcidS: https://fairsharing.org/10.25504/FAIRsharing.wqtfkv
FAIRsharing record for: Beilstein Journal of Organic Chemistry: https://fairsharing.org/10.25504/FAIRsharing.7GA79k
FAIRsharing record for: ChemSpider: https://fairsharing.org/10.25504/FAIRsharing.96f3gm
FAIRsharing record for: EPA Comptox Chemicals Dashboard: https://fairsharing.org/10.25504/FAIRsharing.tfj7gt
FAIRsharing record for: Chemical Abstracts Service Common Chemistry: https://fairsharing.org/10.25504/FAIRsharing.e5c86e
FAIRsharing record for: Chemical Markup Language: https://fairsharing.org/10.25504/FAIRsharing.3mdt9n
FAIRsharing record for: CHARMM Card File Format: https://fairsharing.org/10.25504/FAIRsharing.7hp91k
FAIRsharing record for: CAS Registry Number: https://fairsharing.org/10.25504/FAIRsharing.r7Kwy7
FAIRsharing record for: Elsevier / American Society for Biochemistry and Molecular Biology - Journal of Biological Chemistry - Data Policy: https://fairsharing.org/fairsharing_records/5290
FAIRsharing record for: NFDI4Chem : https://fairsharing.org/fairsharing_records/5027
FAIRsharing record for: Data Standard for Sharing Quantitative Results in Mass Spectrometry Metabolomics: https://fairsharing.org/10.25504/FAIRsharing.207caf
FAIRsharing record for: IUPAC International Chemical Identifier: https://fairsharing.org/10.25504/FAIRsharing.ddk9t9
FAIRsharing record for: Dot Bracket Notation (DBN) - Vienna Format: https://fairsharing.org/10.25504/FAIRsharing.4xrzw1
FAIRsharing record for: VDJdb: a curated database of T-cell receptors with known antigen specificity: https://fairsharing.org/10.25504/FAIRsharing.nwz68
FAIRsharing record for: Reaction InChI: https://fairsharing.org/10.25504/FAIRsharing.58b6f9
FAIRsharing record for: Molecular Process Ontology: https://fairsharing.org/10.25504/FAIRsharing.mct09a
FAIRsharing record for: Name Reaction Ontology: https://fairsharing.org/10.25504/FAIRsharing.w4tncg
FAIRsharing record for: Chemical Methods Ontology: https://fairsharing.org/10.25504/FAIRsharing.9j4wh2
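
The loop above assumes that every hit has both a `name` and a `url`, and every name in the output carries the same "FAIRsharing record for: " prefix. A slightly more defensive version might look like the sketch below. The helper name `format_hit` is made up for illustration, and `hits` here is stand-in data: the first two records are copied from the output above, and the third is a hypothetical record with no URL, added only to show the fallback.

```python
PREFIX = "FAIRsharing record for: "

# Stand-in for the search result; the real 'hits' comes from the API response.
hits = {
    "data": [
        {"attributes": {"name": "FAIRsharing record for: Chemistry",
                        "url": "https://fairsharing.org/fairsharing_records/3524"}},
        {"attributes": {"name": "FAIRsharing record for: ChemSpider",
                        "url": "https://fairsharing.org/10.25504/FAIRsharing.96f3gm"}},
        # HYPOTHETICAL record without a 'url' key, to exercise the fallback
        {"attributes": {"name": "FAIRsharing record for: Example without URL"}},
    ]
}


def format_hit(hit):
    """Build one 'name: url' line, tolerating missing fields (hypothetical helper)."""
    attrs = hit.get("attributes", {})
    name = attrs.get("name") or "(no name)"
    # drop the repeated prefix so the record name itself is easier to scan
    if name.startswith(PREFIX):
        name = name[len(PREFIX):]
    url = attrs.get("url") or "(no url)"
    return f"{name}: {url}"


# print the cleaned-up lines, one per record
for hit in hits["data"]:
    print(format_hit(hit))
```

Using `.get()` with a fallback means a record with a missing field is still listed rather than stopping the loop with a `KeyError`.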