Datasets available for Datahub/DSMLP
The datasets listed below are available for use in DataHub. To browse them, open a terminal window in your DataHub environment and type "cd /datasets". In a Jupyter notebook, the path for reading in a dataset starts with "/datasets/".
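As a rough illustration of the access pattern described above, the following sketch lists what is under /datasets from a notebook cell and searches one dataset folder for files of a given type. The helper names `list_datasets` and `find_files` are hypothetical and not part of DataHub. The folder layout inside each dataset is also an assumption, so inspect the directory before relying on any particular file name.

```python
from pathlib import Path

# Shared dataset location named in the text above.
DATASETS_ROOT = Path("/datasets")


def list_datasets(root=DATASETS_ROOT):
    """Return the sorted names of the entries directly under root.

    Hypothetical helper: returns an empty list when root does not exist,
    which is the usual case outside a DataHub environment.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


def find_files(dataset_dir, suffix):
    """Return sorted paths of files under dataset_dir ending with suffix.

    The match is case-insensitive, so ".tsv" also finds "DATA.TSV".
    Subdirectories are searched recursively.
    """
    dataset_dir = Path(dataset_dir)
    wanted = suffix.lower()
    return sorted(
        path
        for path in dataset_dir.rglob("*")
        if path.is_file() and path.name.lower().endswith(wanted)
    )


if __name__ == "__main__":
    # Print the available dataset folders, one per line.
    for name in list_datasets():
        print(name)
```

Once a file has been located this way, its full path can be passed to whichever reader suits the format, for example `pandas.read_csv(path, sep="\t")` for a tab-separated file, assuming pandas is installed in your environment.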
Several of the datasets below are part of the Library's UC San Diego Educational Dataset Service Collection (item list). To request additional datasets from this collection, please email datahub@ucsd.edu. You may also contact us if you would like a private dataset, since most content in /datasets is publicly readable.
Name of Dataset in DataHub | Title/Details | URL (More Information) |
---|---|---|
BindingDB Dataset, January 1, 2024 | "BindingDB Dataset, January 1, 2024. In BindingDB: Measured Binding Data for Protein-Ligand and Other Molecular Systems. Data downloaded from component 4: TSV file containing all protein-ligand data in BindingDB" | https://doi.org/10.6075/J0BP02ZT |
California Local Tax Ballot Measures, 1986 to 2012 | "California Local Tax Ballot Measures, 1986 to 2012. In California Local Tax Ballot Measures. Data downloaded from component 1: Data" | https://doi.org/10.6075/J09P2ZT7 |
Cars Overhead with Context (COWC) | "Cars Overhead with Context (COWC). In Lawrence Livermore National Laboratory (LLNL) Open Data Initiative. Data downloaded from component 8: COWC-M datasets and networks" | https://doi.org/10.6075/J0CN72BC |
Data from: Genome-Wide Association Study in 3,173 Outbred Rats Identifies Multiple Loci for Body Weight, Adiposity, and Fasting Glucose | "Data from: Genome-Wide Association Study in 3,173 Outbred Rats Identifies Multiple Loci for Body Weight, Adiposity, and Fasting Glucose. In The Center for GWAS in Outbred Rats Database (C-GORD). Data downloaded from components 2 to 6" | https://doi.org/10.6075/J0Q240F0 |
Data from: Multi-Source Feature Fusion for Object Detection Association in Connected Vehicle Environments | "Data from: Multi-Source Feature Fusion for Object Detection Association in Connected Vehicle Environments. Data downloaded from components 3 to 5" | https://doi.org/10.6075/J0HX1CVJ |
Data from: Quantifying influence of human choice on the automated detection of Drosophila behavior by a supervised machine learning algorithm | "Data from: Quantifying influence of human choice on the automated detection of Drosophila behavior by a supervised machine learning algorithm. Data downloaded from components 12 to 32, under Movies and associated files" | https://doi.org/10.6075/J0QF8RDZ |
Parcels, San Diego County, California, 2023 (December) | "Parcels, San Diego County, California, 2023 (December). In San Diego County GIS Data. Data downloaded from component 1: Shapefiles" | https://doi.org/10.6075/J0KH0NJX |
Training image data for: Environmental and ecological drivers of harmful algal blooms in the Southern California Bight | "Training image data for: Environmental and ecological drivers of harmful algal blooms in the Southern California Bight. Data downloaded from components 1 and 2" | https://doi.org/10.6075/J00865GT |
Using NLP to Predict the Severity of Cyber Security Vulnerabilities | "Using NLP to Predict the Severity of Cyber Security Vulnerabilities. In Data Science & Engineering Master of Advanced Study (DSE MAS) Capstone Projects. Data downloaded from component 4: Input data" | https://doi.org/10.6075/J0TX3F89 |
Video Game Reviews Sentiment to Popularity | "Video Game Reviews Sentiment to Popularity. In Data Science & Engineering Master of Advanced Study (DSE MAS) Capstone Projects. Data downloaded from component 4: Input file" | https://doi.org/10.6075/J06D5T5H |