What new public datasets related to COVID-19 are available?


Top 5 answers:

  1. 1-point-3-acres.com and the Johns Hopkins COVID-19 data repository.

  2. We used the open dataset of 2019-nCoV provided by the Johns Hopkins University;

  3. Recently, the COVID-19 Open Research Dataset (CORD-19) was published.

  4. COVID-CT, which contains 349 COVID-19 CT images from 216 patients and 463 non-COVID-19 CTs.

  5. We generated a comprehensive structured dataset of government interventions and their respective timelines of implementation.



1-point-3-acres.com and the Johns Hopkins COVID-19 data repository.

... extracted COVID-19 daily new cases and deaths in the USA from two population-based datasets, namely 1-point-3-acres.com and the Johns Hopkins COVID-19 data repository. The internet search-interest of COVID-19-related terms was obtained using Google Trends. The Pearson correlation test ...

Ref: Trends and Prediction in Daily New Cases and Deaths of COVID-19 in the United States: An Internet Search-Interest Based Model [Explor Res Hypothesis Med, 2020-04-18]
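
The snippet above mentions a Pearson correlation test between search interest and daily case counts. As a minimal sketch of that statistic (not the authors' code), the coefficient can be computed with the standard library alone. The function name `pearson_r` and the sample values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical illustrative values, NOT taken from the cited study.
search_interest = [10, 20, 30, 40, 50]
daily_new_cases = [120, 210, 330, 390, 520]
r = pearson_r(search_interest, daily_new_cases)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 indicates that the two series rise together; it does not by itself establish that one predicts the other.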


We used the open dataset of 2019-nCoV provided by the Johns Hopkins University;

... We used the open dataset of 2019-nCoV provided by the Johns Hopkins University; they made an exceptional dashboard using the affected cases data to date. Apart from this, they also provide an opportunity for data analyst and researcher by providing the data ...

Ref: Analyzing the epidemiological outbreak of COVID‐19: A visual exploratory data analysis approach [J Med Virol, 2020-03-11]


Recently, the COVID-19 Open Research Dataset (CORD-19) was published.

... time of writing this article (April 2020), the world is drastically influenced by the COVID-19. Recently, the COVID-19 Open Research Dataset (CORD-19) was published. For researchers on ID such as ourselves, it is of key interest to learn whether ...

Ref: Coronaviruses and people with intellectual disability: an exploratory data analysis [J Intellect Disabil Res, 2020]


COVID-CT, which contains 349 COVID-19 CT images from 216 patients and 463 non-COVID-19 CTs.

... of COVID-19 based on CTs. To address this issue, we build an open-sourced dataset -- COVID-CT, which contains 349 COVID-19 CT images from 216 patients and 463 non-COVID-19 CTs. The utility of this dataset is confirmed by a senior radiologist who has been diagnosing ...

Ref: COVID-CT-Dataset: A CT Scan Dataset about COVID-19 [J Intellect Disabil Res, 2020-03-30]


We generated a comprehensive structured dataset of government interventions and their respective timelines of implementation.

... strategy of existing public information sources, we developed a specific hierarchical coding scheme for NPIs. We generated a comprehensive structured dataset of government interventions and their respective timelines of implementation. To improve transparency and motivate collaborative validation process, information sources are shared via an open ...

Ref: A structured open dataset of government interventions in response to COVID-19 [J Intellect Disabil Res, 2020-05-08]


'RSNA Pneumonia Detection Challenge dataset' and 'COVID-19 Image Data Collection'.

... dataset: In [8] , a new dataset is proposed by merging two other public datasets: "RSNA Pneumonia Detection Challenge dataset" and "COVID-19 Image Data Collection". The new dataset, called COVIDx, is designed for a classification problem and contemplates three classes: ...

Ref: Towards an Effective and Efficient Deep Learning Model for COVID-19 Patterns Detection in X-ray Images [J Intellect Disabil Res, 2020-04-12]


The collected data includes 224 images with confirmed Covid-19, 700 images with confirmed common bacterial pneumonia, and 504 images of normal condition.

The collected data includes 224 images with confirmed Covid-19, 700 images with confirmed common bacterial pneumonia, and 504 images of normal condition. This dataset is referred to as Dataset_1.

Ref: Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks [Phys Eng Sci Med, 2020-04-03]


We used the available 117 chest X-ray images and 20 CT images (137 images in total) of COVID-19 positive cases.

... of X-ray images provided by Dr. Joseph Cohen available from a GitHub repository [14] . We used the available 117 chest X-ray images and 20 CT images (137 images in total) of COVID-19 positive cases. We also included 117 images of healthy cases of X-ray images from Kaggle Chest X-Ray ...

Ref: Automatic Detection of Coronavirus Disease (COVID-19) in X-ray and CT Images: A Machine Learning-Based Approach [Phys Eng Sci Med, 2020-04-22]


This work curates the largest available experimental dataset for SARS-CoV-2 or SARS-CoV main protease inhibitors.

... than ten years. Drug repositioning becomes one of the most feasible approaches for combating COVID-19. This work curates the largest available experimental dataset for SARS-CoV-2 or SARS-CoV main protease inhibitors. Based on this dataset, we develop validated machine learning models with relatively low root mean ...

Ref: Repositioning of 8565 Existing Drugs for COVID-19. [The journal of physical chemistry letters, 2020-06-16]


The dataset-2 contains around 500 normal, 500 pneumonia and 157 COVID-19 chest X-ray images.

... we tested our proposed model on another dataset prepared by Ozturk et al. [18] . The dataset-2 contains around 500 normal, 500 pneumonia and 157 COVID-19 chest X-ray images. This dataset contains same COVID-19 X-ray images as in our prepared dataset, however normal and ...

Ref: CoroNet: A Deep Neural Network for Detection and Diagnosis of COVID-19 from Chest X-ray Images [Comput Methods Programs Biomed, 2020-06-05]


SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

... the outbreak expands globally, which may help to accelerate the development of novel diagnostics, drugs and vaccines to stop the COVID-19 disease. AVAILABILITY AND IMPLEMENTATION: https://www.genomedetective.com/app/typingtool/cov. CONTACT: koen@emweb.be or deoliveira@ukzn.ac.za. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. ...

Ref: Genome Detective Coronavirus Typing Tool for rapid identification and characterization of novel coronavirus genomes [Bioinformatics, 2020]


Sharma et al. [5] also made a public dashboard available summarising data across more than 5 million real-time tweets.

... similar data collection methodology. Lopez et al. [4] provide another Twitter dataset including the geolocated tweets. There are some further efforts on providing similar datasets from Twitter [10]-[12]. Sharma et al. [5] also made a public dashboard available summarising data across more than 5 million real-time tweets. ...

Ref: A First Instagram Dataset on COVID-19 [Bioinformatics, 2020-04-25]


Some open access X-ray image sets of chest are publicly available.

... date, many AI tools and radiography image datasets are private resources. The access to publicly open COVID-19-related sets of lung CT images towards conducting deep learning experiments is relatively limited. Some open access X-ray image sets of chest are publicly available. ...

Ref: Deep Learning on Chest X-ray Images to Detect and Evaluate Pneumonia Cases at the Era of COVID-19 [Bioinformatics, 2020-04-05]


They subsequently publish their dataset on GitHub and Kaggle [3] , [4] .

... divide the model into 3 class, namely Covid-19, normal, and pneumonia with accuracy reaching 88.2%. They subsequently publish their dataset on GitHub and Kaggle [3] , [4] . At the open source dataset, it used same dataset as Covid-Net model. The aim of ...

Ref: Fast and accurate detection of Covid-19-related pneumonia from chest X-ray images with novel deep learning model [Bioinformatics, 2020-05-10]


These are therefore the most repeated hashtags that appear with #coronavirus.

... intuitive examples, such as corona, covid19, covid 19, stayathome, quarantine, love, covid, virus, and instagram. These are therefore the most repeated hashtags that appear with #coronavirus. Note that this means we might miss posts that mention these concepts in other languages. ...

Ref: A First Instagram Dataset on COVID-19 [Bioinformatics, 2020-04-25]


daily case counts of COVID-19 by reporting date and Chinese province, and a de-identified line list of patients with COVID-19.

... the Modeling of Biological + Socio-technical systems website of Northeastern University. The available data include daily case counts of COVID-19 by reporting date and Chinese province, and a de-identified line list of patients with COVID-19. The line list includes geographical location (country and province), reporting date, dates of symptom onset ...

Ref: Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study [Lancet Digit Health, 2020-02-20]


The dataset contains 5941 chest radiography images of 2839 patient cases.

... convolutional neural network for the detection of COVID-19 infection from chest radiography images open-source dataset. The dataset contains 5941 chest radiography images of 2839 patient cases. In [11] , the authors have developed an image processing technique for the detection, quantification, ...

Ref: Understanding the COVID19 Outbreak: A Comparative Data Analytics and Study [Lancet Digit Health, 2020-03-29]


We created this CORD-NER dataset with comprehensive named entity annotation on the CORD-19 corpus (2020-03-13).

... to all the COVID-19 related new types without much human effort for training data annotation. We created this CORD-NER dataset with comprehensive named entity annotation on the CORD-19 corpus (2020-03-13). This dataset covers 75 fine-grained named entity types. CORD-NER is automatically generated by combining the ...

Ref: Comprehensive Named Entity Recognition on CORD-19 with Distant or Weak Supervision [Lancet Digit Health, 2020-03-27]


In this paper, we created a COVID-19 3D CT dataset with 20 cases that contains 1800+ annotated slices and made it publicly available.

... they are developed on different datasets, trained in different settings, and evaluated with different metrics. In this paper, we created a COVID-19 3D CT dataset with 20 cases that contains 1800+ annotated slices and made it publicly available. To promote the development of annotation-efficient deep learning methods, we built three benchmarks for lung ...

Ref: Towards Efficient COVID-19 CT Annotation: A Benchmark for Lung and Infection Segmentation [Lancet Digit Health, 2020-04-27]


Current open access COVID-19 Twitter data were mainly collected by keywords,

... needs and health seeking behavior [6], and public response to policy makers [7] etc. Current open access COVID-19 Twitter data were mainly collected by keywords, such as coronavirus, Covid-19 etc. [8], [9]; none of them is dedicated ...

Ref: Open access institutional and news media tweet dataset for COVID-19 social science research [Lancet Digit Health, 2020-04-03]


The dataset used for this work is the Our World Dataset,

... Our prediction model is available online at https://github.com/shreshthtuli/covid-19-prediction. The dataset used for this work is the Our World Dataset, available at https://github.com/owid/covid-19-data/tree/master/public/data/. Few interactive graphs can be seen at https://collaboration.coraltele.com/covid/. ...

Ref: Predicting the Growth and Trend of COVID-19 Pandemic using Machine Learning and Cloud Computing [Lancet Digit Health, 2020-05-12]


We collected 22 million Twitter messages related to the COVID-19 pandemic

... disease (COVID-19) related discussions, concerns, and sentiments that emerged from tweets posted by Twitter users. We collected 22 million Twitter messages related to the COVID-19 pandemic using a list of 25 hashtags such as "coronavirus," "COVID-19," "quarantine" from March 1 to April 21 in 2020. ...

Ref: Twitter discussions and concerns about COVID-19 pandemic: Twitter data analysis using a machine learning approach [Lancet Digit Health, 2020-05-26]


The CFR (cumulative incidence of mortality vs recovery) for COVID-19 confirmed positive cases in Ontario is presented in Table 1 .

... competing risk regression model, controlling for gender, with stcrreg in Stata/SE (version 12.0, StataCorp, LLC). The CFR (cumulative incidence of mortality vs recovery) for COVID-19 confirmed positive cases in Ontario is presented in Table 1 . It can be seen that it increases exponentially with age, ...

Ref: Estimates of COVID-19 case-fatality risk from individual-level data [Lancet Digit Health, 2020-04-22]


Additional data related to GDP, physician and hospital bed per 1000 patients were procured from the World Bank database.

... is being updated on a daily basis till 22nd March 2020, the date of analysis. Additional data related to GDP, physician and hospital bed per 1000 patients were procured from the World Bank database. All data were collected in a file in CSV format. Analysis was conducted in Jupyter ...

Ref: Frequency of testing for COVID 19 infection and the presence of higher number of available beds per country predict outcomes with the infection, not the GDP of the country - A descriptive statistical analysis [Lancet Digit Health, 2020-04-06]


growing daily, related to COVID-19 chatter generated from January 1st to April 4th at the time of writing.

... analyses. For this purpose, we present a large-scale curated dataset of over 152 million tweets, growing daily, related to COVID-19 chatter generated from January 1st to April 4th at the time of writing. This open dataset will allow researchers to conduct a number of research projects relating to ...

Ref: A large-scale COVID-19 Twitter chatter dataset for open scientific research -- an international collaboration [Lancet Digit Health, 2020-04-07]


Data on health care utilization and outcomes can be obtained from a variety of sources including individual and multi-institutional EHR data and claims databases.

... methodologies that support our ability to draw conclusions about the causal effects of these interventions. Data on health care utilization and outcomes can be obtained from a variety of sources including individual and multi-institutional EHR data and claims databases. Data on public health interventions are already being compiled by researchers, including national and international ...

Ref: Ideas for how informaticians can get involved with COVID-19 research [BioData Min, 2020-05-12]


the dataset presented in this paper is an examination of COVID-19-related knowledge, risk perceptions and precautionary health behavior among Nigerians.

... Abstract In response to the global call for strategic information to understand the novel coronavirus, the dataset presented in this paper is an examination of COVID-19-related knowledge, risk perceptions and precautionary health behavior among Nigerians. The data were generated during the COVID-19 lockdown in the country through a survey distributed ...

Ref: Survey data of COVID-19-related Knowledge, Risk Perceptions and Precautionary Behavior among Nigerians [Data Brief, 2020-05-08]


The dataset from March 1, 2020 to March 30, 2020 contains 30.8M tweets from 182 countries.

... related to COVID-19 to filter the Twitter stream and obtain relevant tweets about the pandemic. The dataset from March 1, 2020 to March 30, 2020 contains 30.8M tweets from 182 countries. The subset of English tweets equals 20.5M. The data collection is ongoing and will be ...

Ref: COVID-19 on Social Media: Analyzing Misinformation in Twitter Conversations [Data Brief, 2020-03-26]


Finally, we provide time-series data for the cumulative number of COVID-19 confirmed cases and related deaths, from [1].

... Finally, we provide time-series data for the cumulative number of COVID-19 confirmed cases and related deaths, from [1]. This data begins on January 22, 2020. It should be noted that epidemiological modeling efforts may want to consider the uncertainty surrounding U.S. testing [71] , on which these data ...

Ref: A County-level Dataset for Informing the United States' Response to COVID-19 [Data Brief, 2020-04-01]
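
Several of the datasets listed here publish cumulative counts of confirmed cases and deaths. As a rough sketch, assuming a simple list of cumulative totals ordered by date, daily new counts can be derived by differencing. The function name `daily_new_counts` and the sample series are hypothetical, and clamping negative differences to zero is one possible way to handle downward revisions, not a rule taken from any of the cited sources.

```python
def daily_new_counts(cumulative):
    """Convert a date-ordered cumulative series into daily increments.

    The first increment equals the first cumulative value.
    """
    daily = []
    previous = 0
    for total in cumulative:
        # Assumption: a drop in the cumulative total (a data revision)
        # is reported as zero new cases rather than a negative number.
        daily.append(max(total - previous, 0))
        previous = total
    return daily


# Hypothetical cumulative confirmed cases, NOT real data.
cumulative_cases = [1, 1, 2, 5, 5, 11]
new_cases = daily_new_counts(cumulative_cases)
print(new_cases)
```

Analyses that need the revisions themselves would keep the signed differences instead of clamping.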


As of May, 2020, the dataset contains data of over 540 000 patients from 131 countries.

... aggregating curated data from multiple sources, including official government publications, peer-reviewed papers, and online reports. As of May, 2020, the dataset contains data of over 540 000 patients from 131 countries. All data are geocoded and include, where available, simple demographics, presence of comorbidity (as a ...

Ref: Sharing patient-level real-time COVID-19 data [Lancet Digit Health, 2020-05-28]


All data is collected from publicly available sources, including both local health official announcements and reliable media reports, and is integrated from 1064 distinct websites.

... in the U.S. The data covers 3169 sub-country-level regions across the North America. All data is collected from publicly available sources, including both local health official announcements and reliable media reports, and is integrated from 1064 distinct websites. In addition to case information in the CovidNet, our project also includes testing locations and ...

Ref: CovidNet: To Bring Data Transparency in the Era of COVID-19 [Lancet Digit Health, 2020-05-22]


COVID chest X-ray dataset [12] and Kaggle chest X-ray pneumonia dataset [39] .

... order to compose a special COVID-19 dataset, two different publicly available datasets were combined as COVID chest X-ray dataset [12] and Kaggle chest X-ray pneumonia dataset [39] . The obtained COVIDx dataset [11] consists The main purpose of the selection of COVIDx dataset ...

Ref: COVIDiagnosis-Net: Deep Bayes-SqueezeNet based Diagnostic of the Coronavirus Disease 2019 (COVID-19) from X-Ray Images [Med Hypotheses, 2020-04-23]


The trip mode datasets contain hourly subway ridership at 275 stations and hourly traffic volume at 105 traffic count locations across Seoul.

... Specifically, we collect and curate two categories of mobility datasets: trip mode and trip purpose. The trip mode datasets contain hourly subway ridership at 275 stations and hourly traffic volume at 105 traffic count locations across Seoul. Each dataset is a representative of individual movements within the city, accounting for 7.47 million ...

Ref: COVID-19 Mobility Data Collection of Seoul, South Korea [Med Hypotheses, 2020-06-11]


the COVID-19 image data collection and the RSNA Pneumonia Detection Challenge dataset [13, 14] .

... referred to as COVIDx dataset which is an amalgamation of two open access data repositories: the COVID-19 image data collection and the RSNA Pneumonia Detection Challenge dataset [13, 14] . The COVIDx dataset consists of 16,756 chest radiography samples with 76 radiography images for ...

Ref: CoroNet: A Deep Network Architecture for Semi-Supervised Task-Based Identification of COVID-19 from Chest X-ray Images [Med Hypotheses, 2020-04-17]


This dataset contains anonymised human lung computed tomography (CT) scans with COVID-19 related findings (CT1-CT4),

... were obtained between 1st of March, 2020 and 25th of April, 2020, and provided by municipal hospitals in Moscow, Russia. This dataset contains anonymised human lung computed tomography (CT) scans with COVID-19 related findings (CT1-CT4), as well as without such findings (CT0). ...

Ref: MosMedData: Chest CT Scans With COVID-19 Related Findings Dataset [Med Hypotheses, 2020-05-13]


Methods: K-means clustering was employed on the available country-specific COVID-19 epidemiological data and the influential background characteristics.

... transmission of the disease and focus on articulation of necessary interventions in an informed manner. Methods: K-means clustering was employed on the available country-specific COVID-19 epidemiological data and the influential background characteristics. Country-specific case fatality rates and the average number of people tested positive for COVID-19 per ...

Ref: Identification of spatial variations in COVID-19 epidemiological data using K-Means clustering algorithm: a global perspective [Med Hypotheses, 2020-06-05]
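
The snippet above refers to K-means clustering of country-level epidemiological data. The following is a minimal standard-library sketch of Lloyd's algorithm, offered only to illustrate the technique and not as the authors' implementation. The function name `kmeans`, the choice of the first k points as initial centroids, and the two-feature country rows are all assumptions made for this example.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on equal-length numeric tuples."""
    # Assumption: deterministic initialisation from the first k points
    # (real analyses often use random restarts or k-means++ instead).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical rows: (case fatality rate in %, tests per 1000 people).
country_features = [(1.0, 50.0), (1.2, 48.0), (6.0, 5.0), (5.5, 6.0)]
labels, centroids = kmeans(country_features, k=2)
print(labels)
```

Because the features are on different scales, a real analysis would usually standardise them before clustering; that step is omitted here to keep the sketch short.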


Specifically, we report Type I (false pos- [24] .

... and the variability of model performance given the size of current COVID-19 chest x-ray datasets. Specifically, we report Type I (false pos- [24] . This statistical test was carried out using the method described in Dietterich [23] for a ...

Ref: Intra-model Variability in COVID-19 Classification Using Chest X-ray Images [Med Hypotheses, 2020-04-30]


the CORD-19 Open Research Dataset [3] ,

... The list of COVID-19-related articles is created based on two main data sources: the CORD-19 Open Research Dataset [3] , provided by the Allen Institute for AI, and the LitCovid collection [2] provided by the ...

Ref: BIP4COVID19: Releasing impact measures for articles relevant to COVID-19 [bioRxiv, 2020-06-06]


The public repository of data followed a Creative Commons licence for data, and MIT License for Code,

... amidst the COVID-19 outbreak and how this data should be disseminated within a public dashboard. The public repository of data followed a Creative Commons licence for data, and MIT License for Code, with copyright for the Data Science for Social Impact research group at the University of ...

Ref: Use of Available Data To Inform The COVID-19 Outbreak in South Africa: A Case Study [bioRxiv, 2020-04-02]


disease related data from official public health organizations, demographic data, mobility data, and user generated data from social media),

... artificial intelligence (AI) and leveraging the large-scale and real-time data generated from heterogeneous sources (e.g., disease related data from official public health organizations, demographic data, mobility data, and user generated data from social media), in this work, we propose and develop an AI-driven system (named $\alpha$-Satellite), as an initial ...

Ref: $\alpha$-Satellite: An AI-driven System and Benchmark Datasets for Hierarchical Community-level Risk Assessment to Help Combat COVID-19 [bioRxiv, 2020-03-27]


zip code level data would further ...

... of resources compared to the widely used state level model from IHME (https://covid19.healthdata.org/united-states-of-america), zip code level data would further ...

Ref: CovidCounties - an interactive, real-time tracker of the COVID-19 pandemic at the level of US counties [medRxiv, 2020-05-02]


computed tomography (CT) and chest X-ray (CXR) imaging;

... desirable. Recently, with the release of publicly available datasets of corona positive patients comprising of computed tomography (CT) and chest X-ray (CXR) imaging; scientists, researchers and healthcare experts are contributing for faster and automated diagnosis of COVID-19 by ...

Ref: Automated diagnosis of COVID-19 with limited posteroanterior chest X-ray images using fine-tuned deep neural networks [medRxiv, 2020-04-23]


It contains country and statewise daily new cases, recovered and death data of COVID-19.

... the COVID-19 time-series dataset from the GitHub repository CSSEGISandData/COVID-19 (Dong et al., 2020) , maintained by the amazing team at Johns Hopkins University Center for Systems Science and Engineering (CSSE). It contains country and statewise daily new cases, recovered and death data of COVID-19. ...

Ref: COVID-19: Social Media Sentiment Analysis on Reopening [medRxiv, 2020-06-01]


This dataset contains 349 samples of COVID-19 positive and 397 COVID-19 negative CT scans,

... the proposed LA-DNN model on the public dataset collected by He et al. [6] . This dataset contains 349 samples of COVID-19 positive and 397 COVID-19 negative CT scans, which are collected from 760 preprints about COVID-19 from medRxiv and bioRxiv, posted from January ...

Ref: Online COVID-19 diagnosis with chest CT images: Lesion-attention deep neural networks [medRxiv, 2020-05-14]


Another initiative is the release of the COVID-19 Open Research Dataset (CORD-19) [2] .

... Another initiative is the release of the COVID-19 Open Research Dataset (CORD-19) [2] . CORD-19 is a growing, weekly-updated dataset of COVID-19 publications, capturing new as well as past research on "COVID-19 and the coronavirus family of viruses for use by the global research ...

Ref: A scientometric overview of CORD-19 [bioRxiv, 2020-04-20]


National register of hospital discharge diagnoses with specific ICD-10 codes related to respiratory infections are available with a one-year time lag,

... National register of hospital discharge diagnoses with specific ICD-10 codes related to respiratory infections are available with a one-year time lag, which makes its use for SARI surveillance problematic. Efforts are underway to improve timeliness, which could make these data potentially valuable for SARI surveillance. In Germany, weekly SARI surveillance was ...

Ref: Experience of establishing severe acute respiratory surveillance in the Netherlands: evaluation and challenges [Public Health in Practice, 2020-05-30]


We have gathered data between January 5 and March 30 2020 ( §III).

... paper introduces a COVID- 19 Instagram dataset, which we make available for the research community. We have gathered data between January 5 and March 30 2020 ( §III). The dataset covers 18.5K comments and 329K likes from 5.3K posts. These posts have been ...

Ref: A First Instagram Dataset on COVID-19 [Public Health in Practice, 2020-04-25]


collecting COVID-19 and other chest pneumonia X-ray images from two different publicly available databases.

... on Xception architecture pre-trained on ImageNet dataset and trained end-to-end on a dataset prepared by collecting COVID-19 and other chest pneumonia X-ray images from two different publicly available databases. RESULTS: CoroNet has been trained and tested on the prepared dataset and the experimental results ...

Ref: CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images [Comput Methods Programs Biomed, 2020]


The amount of data produced from the dawn of humankind through 2003 is generated today within a few minutes.

... outbreak, however, the COVID-19 emergency is occurring in a much more digitized and connected world. The amount of data produced from the dawn of humankind through 2003 is generated today within a few minutes. Furthermore, advanced computational models, such as those based on machine learning, have shown great potential ...

Ref: On the responsible use of digital data to tackle the COVID-19 pandemic [Nat Med, 2020-03-27]


We built the MFDD, RMFRD and SMFRD datasets,

... We built the MFDD, RMFRD and SMFRD datasets, and developed a state-of-the-art algorithm based on these datasets. The algorithm will serve the applications of contactless face authentication in community access, campus management, and enterprise resumption scenarios. Our research ...

Ref: Masked Face Recognition Dataset and Application [Nat Med, 2020-03-20]


This dataset comprises (i) sociodemographic characteristics, compiled from 35 datasets obtained at UN Data;

... Understanding the COVID-19 pandemic is a multidisciplinary effort that requires a significant number of variables. This dataset comprises (i) sociodemographic characteristics, compiled from 35 datasets obtained at UN Data; (ii) mobility metrics that can assist the analysis of social distancing, from Google Community Mobility ...

Ref: Dataset for country profile and mobility analysis in the assessment of COVID-19 pandemic [Data Brief, 2020]


Periodic versioned snapshots were released as "Covid19Kerala.info-Data".

... The datasets are provided with the schema definition and an actionable data-package declaration. Periodic versioned snapshots were released as "Covid19Kerala.info-Data". CODD-K manages the longevity and stewardship of the data. Sufficient documentation is provided to increase ...

Ref: A citizen science initiative for open data and visualization of COVID-19 outbreak in Kerala, India [Data Brief, 2020-05-18]


a new COVID-19 machine readable dataset known as COVID-19 Open Research Dataset (CORD-19) has been released.

... need for having a knowledge repository for the disease became crucial. To address this issue, a new COVID-19 machine readable dataset known as COVID-19 Open Research Dataset (CORD-19) has been released. Based on this, our objective was to build a computable co-occurrence network embeddings to assist ...

Ref: Constructing Co-occurrence Network Embeddings to Assist Association Extraction for COVID-19 and Other Coronavirus Infectious Diseases. [Journal of the American Medical Informatics Association : JAMIA, 2020-05-27]


At the time of the data collection for this paper on February 20, 2020, over 72 thousand cases have been recorded in China,

... The coronavirus disease COVID-19 started in December 2019 in Wuhan, the capital of Hubei, China. At the time of the data collection for this paper on February 20, 2020, over 72 thousand cases have been recorded in China, including over 1,870 deaths, and around 700 people, mostly travellers, were diagnosed in the rest ...

Ref: Advertisers Jump on Coronavirus Bandwagon: Politics, News, and Business [Journal of the American Medical Informatics Association : JAMIA, 2020-03-02]


The respective classes are annotated with Y for yes, N for no and U for unknown.

... provide publicly accessible medical imaging data of COVID-19 cases. The types of data are classified by CT, X-Ray, magnetic resonance tomography (MRT), metadata of the corresponding patient/case, and case review. The respective classes are annotated with Y for yes, N for no and U for unknown. ...

Ref: COVID-19: A Survey on Public Medical Imaging Data Resources [Journal of the American Medical Informatics Association : JAMIA, 2020-04-08]


COVID-19-Merging, COVID-19-inside-Hubei and COVID-19-outside-Hubei.

... the accuracy percentage using different datasets with different sizes. Three datasets are used, which are COVID-19-Merging, COVID-19-inside-Hubei and COVID-19-outside-Hubei. As illustrated in figure 9 , the percentage of the query answers range between 98% ...

Ref: A Multi-Dimensional Big Data Storing System for Generated COVID-19 Large-Scale Data using Apache Spark [J Intellect Disabil Res, 2020-04-30]


daily number of new internationally exported cases (or lack thereof), by date of onset, as of Jan 26, 2020;

... in Wuhan and internationally exported cases from Wuhan. The four datasets we fitted to were: daily number of new internationally exported cases (or lack thereof), by date of onset, as of Jan 26, 2020; daily number of new cases in Wuhan with no market exposure, by date of onset, ...

Ref: Early dynamics of transmission and control of COVID-19: a mathematical modelling study [Lancet Infect Dis, 2020]


In total, we have obtained 135 fake news articles, 1,568 true news articles, 27 fake claims and 166 true claims.

... After we obtained all URLs to true and fake news related to COVID-19, we used the Newspaper3k to fetch their corresponding title, content, abstract, and keywords. In total, we have obtained 135 fake news articles, 1,568 true news articles, 27 fake claims and 166 true claims. ...

Ref: CoAID: COVID-19 Healthcare Misinformation Dataset [Lancet Infect Dis, 2020-05-22]


The top 15 keywords and their frequencies for the three stages of the public's attention to the COVID-19 epidemic are shown in Table 3.

... The top 15 keywords and their frequencies for the three stages of the public's attention to the COVID-19 epidemic are shown in Table 3. We can find that "Wuhan", "case" and "pneumonia" appear in all three periods as hot keywords, while the remaining keywords differ slightly between periods. In stage A, ...

Ref: Chinese Public Attention to COVID-19 Epidemic: Based on Social Media [Lancet Infect Dis, 2020-03-20]
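
A top-k keyword ranking of the kind mentioned in this snippet can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the token list is hypothetical, and the tokenisation and stop-word handling that a real social media analysis would need are assumed to have happened already.

```python
from collections import Counter


def top_keywords(tokens, k=15):
    """Return the k most frequent tokens with their counts, most frequent first."""
    return Counter(tokens).most_common(k)


# Hypothetical, already-tokenised posts (not data from the paper).
tokens = ["Wuhan", "case", "pneumonia", "Wuhan", "case", "Wuhan"]
top = top_keywords(tokens, k=2)
print(top)
```

Running the same function separately on the posts of each stage would give one ranked list per stage, which is the shape of table the snippet refers to.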


From two public datasets, 1248 CXR images were obtained,

... COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy on chest X-ray (CXR) images. Materials and Methods: From two public datasets, 1248 CXR images were obtained, which included 215, 533, and 500 CXR images of COVID-19 pneumonia patients, non-COVID-19 pneumonia patients, ...

Ref: Automatic classification between COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy on chest X-ray image: combination of data augmentation methods [Lancet Infect Dis, 2020-06-01]


growing daily,

... analyses. For this purpose, we present a large-scale curated dataset of over 152 million tweets, growing daily, related to COVID-19 chatter generated from January 1st to April 4th at the time of ...

Ref: A large-scale COVID-19 Twitter chatter dataset for open scientific research -- an international collaboration [Lancet Infect Dis, 2020-04-07]


This dataset is extracted from the NIH Chest Xray dataset [21] including 112,120 X-ray images for 14 thorax abnormalities.

... X-ray images, consisting of 94,323 frontal view chest X-ray images for common thorax diseases. This dataset is extracted from the NIH Chest Xray dataset [21] including 112,120 X-ray images for 14 thorax abnormalities. From existing 15 diseases in this dataset, 5 classes were constructed with the help of ...

Ref: COVID-CAPS: A Capsule Network-based Framework for Identification of COVID-19 cases from X-ray Images [Lancet Infect Dis, 2020-04-06]


The publicly available datasets on confirmed COVID-19 cases and deaths provide a key opportunity to better understand the drivers of the pandemic.

... The publicly available datasets on confirmed COVID-19 cases and deaths provide a key opportunity to better understand the drivers of the pandemic. Research using these datasets has been growing at a very fast pace (see an indicative list of references in supplementary material 1). However, little attention has been paid to the ...

Ref: The Challenge of Using Epidemiological Case Count Data: The Example of Confirmed COVID-19 Cases and the Weather [Lancet Infect Dis, 2020-05-23]


COVID-19 image data collection [19], and COVID-19 X-rays [20].

... were adopted from three publicly available X-ray datasets: RSNA Pneumonia Detection Challenge dataset [18], COVID-19 image data collection [19], and COVID-19 X-rays [20]. These datasets are open source and fully accessible to the research community. The first dataset ...

Ref: Multi-Channel Transfer Learning of Chest X-ray Images for Screening of COVID-19 [Lancet Infect Dis, 2020-05-12]


The publicly available dataset Worldometer

... parameters of COVID-19 and SARS-CoV infectious diseases in terms of incidence, mortality, and recovery rates. The publicly available dataset Worldometer (extracted on April 5, 2020) confirmed by WHO report was available for meta-analysis purposes using ...

Ref: Comparative Global Epidemiological Investigation of SARS-CoV-2 and SARS-CoV Diseases Using Meta-MUMS Tool Through Incidence, Mortality, and Recovery Rates [Arch. med. res, 2020]


We validated our networks on 31 cases of COVID-19, 4420 cases of pneumonia, and 6851 normal cases.

... We validated our networks on 31 cases of COVID-19, 4420 cases of pneumonia, and 6851 normal cases. The reason our training data was less than the validation data is that we had a few cases of COVID-19 among many normal and pneumonia cases. Therefore, we could not ...

Ref: A modified deep convolutional neural network for detecting COVID-19 and pneumonia from chest X-ray images based on the concatenation of Xception and ResNet50V2 [Inform Med Unlocked, 2020-05-26]


In this paper, we present GeoCoV19, a large-scale Twitter dataset about the COVID-19 pandemic.

... Twitter provides timely access to health-related data about chronic disease, outbreaks, and epidemics [4]. In this paper, we present GeoCoV19, a large-scale Twitter dataset about the COVID-19 pandemic. Coronavirus disease 2019 or COVID-19 is an infectious disease that was first identified in December ...

Ref: GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with Location Information [Inform Med Unlocked, 2020-05-22]


The dataset used for this work includes 100 chest X-ray images acquired on 70 subjects,

... The dataset used for this work includes 100 chest X-ray images acquired on 70 subjects, all of which were confirmed with COVID-19, and 1431 chest X-ray images diagnosed as pneumonia (not COVID-19) from 1008 subjects. The COVID-19 cases are available at the Github repository 2 ...

Ref: COVID-19 Screening on Chest X-ray Images Using Deep Learning based Anomaly Detection [Inform Med Unlocked, 2020-03-27]


We have collected and processed over 100 million tweets related to Coronavirus (focused on USA) which is about 700GB data in size.

... and to model the public emotions, we have started collecting tweets from 5th March 2020. We have collected and processed over 100 million tweets related to Coronavirus (focused on USA) which is about 700GB data in size. We understand that processing this huge amount of data in real-time requires a substantial amount ...

Ref: CoronaVis: A Real-time COVID-19 Tweets Analyzer [Inform Med Unlocked, 2020-04-29]


The available data include daily case counts of COVID-19 by reporting date and Chinese province,

... on the Laboratory for the Modeling of Biological + Socio-technical systems website of Northeastern University. The available data include daily case counts of COVID-19 by reporting date and Chinese province, and a de-identified line list of patients with COVID-19. The line list includes geographical location ...

Ref: Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study [Lancet Digit Health, 2020-02-20]


we first annotated left lung, right lung and infections of 20 COVID-19 3D CT scans,

... the above problems by providing a well-labelled COVID-19 CT dataset and a benchmark. In particular, we first annotated left lung, right lung and infections of 20 COVID-19 3D CT scans, and then established three tasks to benchmark different deep learning strategies with limited training cases. ...

Ref: Towards Efficient COVID-19 CT Annotation: A Benchmark for Lung and Infection Segmentation [Lancet Digit Health, 2020-04-27]


GT data were then compared with daily data on COVID-19 cases that were obtained from the Taiwan Centers for Disease Control's website.

... disease transmission. Data on relative search volumes (RSVs) were filtered by geographic region in Taiwan. GT data were then compared with daily data on COVID-19 cases that were obtained from the Taiwan Centers for Disease Control's website. Moving averages with an interval of three days of GT queries and number of COVID-19 ...

Ref: Applications of Google Search Trends for risk communication in infectious disease management: A case study of the COVID-19 outbreak in Taiwan [Int J Infect Dis, 2020-03-12]
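
The three-day moving average mentioned in this snippet can be sketched as below. The snippet does not say whether the window is trailing or centred, so a trailing window is assumed here, and the input values are hypothetical rather than taken from the study.

```python
def moving_average(values, window=3):
    """Trailing moving average; returns one value per full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical daily relative search volumes (not data from the study).
rsv = [10, 20, 30, 40, 50]
smoothed = moving_average(rsv, window=3)
print(smoothed)
```

Applying the same smoothing to both the search-volume series and the daily case counts keeps the two series aligned before they are compared.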


1) molecular fingerprints to aid similarity searches,

... available 23 datasets collected from community sources representing over 4.2 B molecules enriched with pre-computed: 1) molecular fingerprints to aid similarity searches, 2) 2D images of molecules to enable exploration and application of image-based deep learning methods, ...

Ref: Targeting SARS-CoV-2 with AI- and HPC-enabled Lead Generation: A First Data Release [Int J Infect Dis, 2020-05-28]


we managed to collect the Covid Radiographic images Data-set for AI (CORDA),

... in Turin in the last days of March (at the peak of epidemic in Italy), we managed to collect the Covid Radiographic images Data-set for AI (CORDA), currently comprising images from 386 Patients that underwent COVID screening. The data are still limited ...

Ref: Unveiling COVID-19 from Chest X-ray with deep learning: a hurdles race with small data [Int J Infect Dis, 2020-04-11]


Those cities with over 50 confirmed cases monthly were taken as a discovery dataset to exclude the confounding effect due to purely imported cases.

... day between January 20 and March 11 for 430 cities and districts all over China. Those cities with over 50 confirmed cases monthly were taken as a discovery dataset to exclude the confounding effect due to purely imported cases. Four time points delay of the weather conditions from the day of epidemic situation evaluation ...

Ref: Roles of meteorological conditions in COVID-19 transmission on a worldwide scale [Int J Infect Dis, 2020-03-20]


$normal$, $other\_disease$, $pneumonia$ and $Covid-19$.

... a related large chest X-Ray dataset that is tuned for classifying between four classes viz. $normal$, $other\_disease$, $pneumonia$ and $Covid-19$. A 5-fold cross validation is performed to estimate the feasibility of using chest X-Rays to ...

Ref: Deep Learning for Screening COVID-19 using Chest X-Ray Images [Int J Infect Dis, 2020-04-22]
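
As a rough illustration of the 5-fold cross-validation mentioned in this snippet, the sketch below only generates the train/test index splits. It assumes contiguous, unshuffled folds; the paper's actual splitting strategy (shuffling, stratification by class) is not stated in the excerpt.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Ten hypothetical samples split into five folds of two.
folds = list(k_fold_indices(10, k=5))
for train_idx, test_idx in folds:
    print(test_idx)
```

Each sample appears in exactly one test fold, so averaging the per-fold scores uses every image once for evaluation.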


And Chen et al. (2020a) published the first public COVID-19 Twitter dataset.

... available all its data on patents in what it calls the Human Coronavirus Innovation Landscape Patent and Research Works Open Datasets to support the search for new and repurposed drugs. And Chen et al. (2020a) published the first public COVID-19 Twitter dataset. ...

Ref: Artificial intelligence vs COVID-19: limitations, constraints and pitfalls [AI Soc, 2020-04-28]


The CT image dataset contains 746 public chest CT images of COVID-19 patients collected from over 760 preprints,

... (LA-DNN) to predict COVID-19 positive or negative with a richly annotated chest CT image dataset. The CT image dataset contains 746 public chest CT images of COVID-19 patients collected from over 760 preprints, and the data annotations are accompanied with the textual radiology reports. We extract two types ...

Ref: Online COVID-19 diagnosis with chest CT images: Lesion-attention deep neural networks [AI Soc, 2020-05-14]


Data on relative search volumes (RSVs) were filtered by geographic region in Taiwan.

... information on COVID-19 and the practice of personal hygiene in order to prevent disease transmission. Data on relative search volumes (RSVs) were filtered by geographic region in Taiwan. GT data were then compared with daily data on COVID-19 cases that were obtained from ...

Ref: Applications of Google Search Trends for risk communication in infectious disease management: A case study of the COVID-19 outbreak in Taiwan [Int J Infect Dis, 2020-03-12]


After 9 weeks, more than 130 sites have enrolled in the program and more than 4,000 records have been abstracted in the national dataset.

... sites can leverage these data for onsite, rapid quality improvement and benchmarking versus other institutions. After 9 weeks, more than 130 sites have enrolled in the program and more than 4,000 records have been abstracted in the national dataset. Additionally, the aggregate dataset will be a valuable data resource for the medical research community. ...

Ref: The American Heart Association COVID-19 CVD Registry powered by Get With The Guidelines®. [Circulation. Cardiovascular quality and outcomes, 2020-06-17]


chest X-ray images CXR (CR, DX) and computed tomography (CT) imaging of COVID-19+ patients

... paper describes BIMCV COVID-19+, a large dataset from the Valencian Region Medical ImageBank (BIMCV) containing chest X-ray images CXR (CR, DX) and computed tomography (CT) imaging of COVID-19+ patients along with their radiological findings and locations, pathologies, radiological reports (in Spanish), DICOM metadata, Polymerase ...

Ref: BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients [Circulation. Cardiovascular quality and outcomes, 2020-06-01]


We use publicly available frontal chest X-ray images from 181 patients [7, 8].

... We use publicly available frontal chest X-ray images from 181 patients [7, 8]. The dataset consisted of patient scans from Italy, Taiwan, China, Australia, Israel, among other locations and was labeled as positive COVID-19 detection from expert radiologists. The images were collected from ...

Ref: Deep learning COVID-19 detection bias: accuracy through artificial intelligence [Int Orthop, 2020-05-27]


we created a COVID-19 3D CT dataset with 20 cases that contains 1800+ annotated slices and made it publicly available.

... on different datasets, trained in different settings, and evaluated with different metrics. In this paper, we created a COVID-19 3D CT dataset with 20 cases that contains 1800+ annotated slices and made it publicly available. To promote the development of annotation-efficient deep learning methods, we built three benchmarks for lung ...

Ref: Towards Efficient COVID-19 CT Annotation: A Benchmark for Lung and Infection Segmentation [Int Orthop, 2020-04-27]


we generated the partial dependency plots for the odds of Death among COVID-19 patients with Age and Days from the onset of symptoms to hospitalisation.

... to inspect the marginal effect of the predictors over the mortality of patients with COVID-19, we generated the partial dependency plots for the odds of Death among COVID-19 patients with Age and Days from the onset of symptoms to hospitalisation. As shown in Figure 2 (C, D), accentuation in odds of death was found with ...

Ref: A Machine Learning Model Reveals Older Age and Delayed Hospitalization as Predictors of Mortality in Patients with COVID-19 [Int Orthop, 2020-03-30]


dataset prepared by collecting COVID-19 and other chest pneumonia X-ray images from two different publicly available databases.

... model is based on Xception architecture pre-trained on ImageNet dataset and trained end-to-end on a dataset prepared by collecting COVID-19 and other chest pneumonia X-ray images from two different publicly available databases. RESULTS: CoroNet has been trained and tested on the prepared dataset and the experimental results ...

Ref: CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images [Comput Methods Programs Biomed, 2020]


newly simulated datasets, following the analysis of different univariate 'Long Short Term Memory (LSTM)' models for forecasting new cases and resulting deaths.

... factors correlated with COVID-19, after analyzing existing datasets available in "ourworldindata.org (Oxford University database)", and newly simulated datasets, following the analysis of different univariate "Long Short Term Memory (LSTM)" models for forecasting new cases and resulting deaths. The result shows that vanilla, stacked, and bidirectional LSTM models outperformed multilayer LSTM models. Besides, ...

Ref: Statistical Explorations and Univariate Timeseries Analysis on COVID-19 Datasets to Understand the Trend of Disease Spreading and Death. [Sensors, 2020-05-29]


We prepared a dataset of 5,000 images with binary labels, for COVID-19 detection from Chest X-ray images.

... We prepared a dataset of 5,000 images with binary labels, for COVID-19 detection from Chest X-ray images. This dataset can serve as a benchmark for the research community. The images in COVID-19 class are labeled by a board-certified radiologist, and only those with a clear sign ...

Ref: Deep-COVID: Predicting COVID-19 From Chest X-Ray Images Using Deep Transfer Learning [Sensors, 2020-04-20]


K-means clustering was employed on the available country-specific COVID-19 epidemiological data and the influential background characteristics.

... of the disease and focus on articulation of necessary interventions in an informed manner. Methods: K-means clustering was employed on the available country-specific COVID-19 epidemiological data and the influential background characteristics. Country-specific case fatality rates and the average number of people tested positive for COVID-19 per ...

Ref: Identification of spatial variations in COVID-19 epidemiological data using K-Means clustering algorithm: a global perspective [Sensors, 2020-06-05]
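
For readers unfamiliar with the technique named in this snippet, here is a minimal pure-Python sketch of K-means (Lloyd's algorithm). It is not the authors' implementation: the two features per point and all values are hypothetical, and the initialisation simply takes the first k points, whereas practical analyses usually rely on a library with better seeding.

```python
def kmeans(points, k, iterations=10):
    """Cluster points with Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical two-feature records, one per country (not data from the study).
points = [(1.0, 2.0), (9.0, 8.0), (1.5, 1.8), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

Because K-means uses Euclidean distance, features on very different scales would normally be standardised first.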


On the publicly available covid-chestxray-dataset [2],

... AI Detector, a novel deep neural network based model to triage patients for appropriate testing. On the publicly available covid-chestxray-dataset [2], our model gives 90.5% accuracy with 100% sensitivity (recall) for the COVID-19 infection. We significantly ...

Ref: CovidAID: COVID-19 Detection Using Chest X-Ray [Sensors, 2020-04-21]
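
The accuracy and sensitivity (recall) figures quoted in this snippet follow from confusion-matrix counts. The sketch below shows the standard definitions; the counts are hypothetical and are not the paper's results.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of actual positives that are detected (recall)."""
    return tp / (tp + fn)


# Hypothetical counts: every positive is caught, two negatives are misflagged.
tp, fn, tn, fp = 8, 0, 10, 2
print(accuracy(tp, tn, fp, fn), sensitivity(tp, fn))
```

The example shows how a model can reach 100% sensitivity while its overall accuracy stays lower: the errors are all false positives.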


Dimensions' COVID-19 dataset and the Allen Institute for AI's CORD-19).

... topics and research resources. We apply this method on two recently released publications datasets (Dimensions' COVID-19 dataset and the Allen Institute for AI's CORD-19). The results reveal intriguing information including increased efforts in topics such as social distancing; cross-domain ...

Ref: Visualising COVID-19 Research [Sensors, 2020-05-13]


https://github.com/X-zhangyang/Real-World-Masked-Face-Dataset.

... applications on masked faces can be developed. The multi-granularity masked face recognition model we developed achieves 95% accuracy, exceeding the results reported by the industry. Our datasets are available at: https://github.com/X-zhangyang/Real-World-Masked-Face-Dataset. ...

Ref: Masked Face Recognition Dataset and Application [Sensors, 2020-03-20]


There are scientific publications using chest X-ray images in the diagnosis of MERS-CoV and SARS-CoV.

... MERS-CoV and SARS-CoV are described as cousins of COVID-19. There are scientific publications using chest X-ray images in the diagnosis of MERS-CoV and SARS-CoV. The study by Ahmet Hamimi on MERS-CoV showed that there are features in the chest X-ray and CT that ...

Ref: Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks [Sensors, 2020-03-24]


The most reported predictors of presence of covid-19 in patients with suspected disease included age, body temperature, and signs and symptoms.

... or length of hospital stay. Only one study used patient data from outside of China. The most reported predictors of presence of covid-19 in patients with suspected disease included age, body temperature, and signs and symptoms. The most reported predictors of severe prognosis in patients with covid-19 included age, sex, features ...

Ref: Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal [BMJ, 2020-04-07]


'COVID19 Global Forecasting (Week 2)' dataset [6] that was released by the Kaggle 1 platform.

... In this study, we have incorporated the "COVID19 Global Forecasting (Week 2)" dataset [6] that was released by the Kaggle 1 platform. The dataset includes daily updates of the COVID-19 confirmed cases and mortality rates for 173 countries reported by WHO between 22 January ...

Ref: Understanding Economic and Health Factors Impacting the Spread of COVID-19 Disease [BMJ, 2020-04-11]


This application is accompanied by a manually curated dataset that catalogs all major public policy actions made at the state-level,

... daily disease trends at the level of US counties using time series plots and maps. This application is accompanied by a manually curated dataset that catalogs all major public policy actions made at the state-level, as well as technical validation of the primary data. Finally, the underlying code for the ...

Ref: CovidCounties - an interactive, real-time tracker of the COVID-19 pandemic at the level of US counties [medRxiv, 2020-05-02]


The cumulative data set contains 190 COVID-19 images, 1345 viral pneumonia images, and 1341 normal chest x-ray images.

... dataset, images from recently published articles, and a data set hosted at Kaggle 15 . The cumulative data set contains 190 COVID-19 images, 1345 viral pneumonia images, and 1341 normal chest x-ray images. The authors further created 2500 augmented images from each category for the training and validation ...

Ref: COVID-19 Datasets: A Survey and Future Challenges [medRxiv, 2020-05-26]


The authors developed a dataset with many of the socioeconomic, demographic, travel, and health care features likely to impact COVID-19 mortality.

... account various epidemiologic factors of disease spread and more recently some of the mitigation measures. The authors developed a dataset with many of the socioeconomic, demographic, travel, and health care features likely to impact COVID-19 mortality. The dataset was compiled using 20 variables for each of the fifty states in the ...

Ref: Explainable machine learning models to understand determinants of COVID-19 mortality in the United States [medRxiv, 2020-05-26]


A total of 11 salient topics are identified and then categorized into 10 themes,

... 1.8 million Tweets messages related to coronavirus collected from January 20th to March 7th, 2020. A total of 11 salient topics are identified and then categorized into 10 themes, such as "cases outside China (worldwide)," "COVID-19 outbreak in South Korea," "early signs of the outbreak in New ...

Ref: Machine learning on Big Data from Twitter to understand public reactions to COVID-19 [medRxiv, 2020-05-18]