- Amazon Web Services Data: http://aws.amazon.com/datasets
- Airlines Data (2009 ASA Challenge): http://stat-computing.org/dataexpo/2009/the-data.html
- AppliedPredictiveModeling (R package): http://bit.ly/16wyvkG
- Australian Weather: http://www.bom.gov.au/climate/dwo/
- Causality Workbench: http://www.causality.inf.ethz.ch/repository.php
- Kaggle competition data: http://www.kaggle.com/
- KDNuggets competition site: http://www.kdnuggets.com/datasets/
- The Koblenz Network Collection: http://konect.uni-koblenz.de/
- Machine Learning Data Set Repository: http://mldata.org/
- Medicare Data File: http://go.cms.gov/19xxPN4
- Microsoft Research: http://research.microsoft.com/apps/dp/dl/downloads.aspx
- Million Song Dataset: http://blog.echonest.com/post/3639160982/million-song-dataset
- More song datasets: http://labrosa.ee.columbia.edu/millionsong/pages/additional-datasets
- MovieLens Data Sets: http://datahub.io/dataset/movielens
- NYC Taxi Data (2010-2013): http://publish.illinois.edu/dbwork/open-data/
- RDataMining.com R and Data Mining ebook data: http://www.rdatamining.com/data
- Social Networking: http://www.cs.cmu.edu/~jelsas/data/ancestry.com/
- UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/
- 53.5 billion clicks: http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset
- Stanford Large Network Dataset Collection: http://snap.stanford.edu/data/
- Data360: http://www.data360.org/index.aspx
- Factual: http://www.factual.com/
- Freebase: http://www.freebase.com/
- Google: http://www.google.com/publicdata/directory
- infochimps: http://www.infochimps.com/
- Quora: http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
- RS Collection 100+: http://rs.io/2014/05/29/list-of-data-sets.html
- Sample R data sets: http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html (R!)
- SourceForge Research Data: http://www.nd.edu/~oss/Data/data.html
- StatSci.org: http://www.statsci.org/datasets.html
- UFO Reports: http://www.nuforc.org/webreports.html
- WikiLeaks 9/11 pager intercepts: http://911.wikileaks.org/files/index.html
- The Washington Post List: http://www.washingtonpost.com/wp-srv/metro/data/datapost.html
- Agricultural Experiments: http://www.inside-r.org/packages/cran/agridat/docs/agridat (R!)
- Climate data: http://www.cru.uea.ac.uk/cru/data/temperature/#datter and ftp://ftp.cmdl.noaa.gov/
- Gene Expression Omnibus: http://www.ncbi.nlm.nih.gov/geo/
- Geo Spatial Data: http://geodacenter.asu.edu/datalist/
- Human Microbiome Project: http://www.hmpdacc.org/reference_genomes/reference_genomes.php
- MIT Cancer Genomics Data: http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi
- NASA: http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html
- NIH Microarray data: ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE6532/ (R!)
- Protein structure: http://www.infobiotic.net/PSPbenchmarks/
- Public Gene Data: http://www.pubgene.org/
- Stanford Microarray Data: http://smd.stanford.edu//
- General Social Survey: http://www3.norc.org/GSS+Website/
- ICPSR: http://www.icpsr.umich.edu/icpsrweb/ICPSR/index.jsp
- Pew Research: http://www.pewinternet.org/datasets/pages/2/
- SNAP: http://snap.stanford.edu/data/index.html
- UCLA Social Sciences Archive: http://dataarchives.ss.ucla.edu/Home.DataPortals.htm
- Upjohn Institute: search for data at http://www.upjohn.org
- Time Series data Library: http://robjhyndman.com/TSDL/
- Carnegie Mellon University Enron email: http://www.cs.cmu.edu/~enron/
- Keel Repository: http://sci2s.ugr.es/keel/datasets.php
- Ohio State University Financial data: http://fisher.osu.edu/fin/fdf/osudata.htm
- UC Berkeley: http://ucdata.berkeley.edu/
- UCLA: http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data
- UC Riverside Time Series: http://www.cs.ucr.edu/~eamonn/time_series_data/
- University of Toronto: http://www.cs.toronto.edu/~delve/data/datasets.html

