Tools for exploring the scientific literature

2021-08-10

There are millions of scientific publications, and many people have created tools for exploring them. In this post, we’ll highlight a few websites and tools that allow us to search for text inside figures, view citation networks, share public comments, and more.

Did I miss a resource you would like to share? Let me know, and I might add it to this list.


Academic#

EVEX: search for bio-molecular events across millions of articles#

http://www.evexdb.org/

EVEX is a text mining resource built on top of PubMed abstracts and PubMed Central full texts. It contains over 40 million bio-molecular events among more than 76 million automatically extracted gene/protein name mentions. The text mining data further has been enriched with gene normalization results, allowing straightforward integration with external resources. Further, gene families from Ensembl and HomoloGene provide homology-based event generalizations. EVEX presents both direct and indirect associations between genes and proteins, enabling explorative browsing of relevant literature.


JANE: Journal/Author Name Estimator#

https://jane.biosemantics.org/

Have you recently written a paper, but you’re not sure to which journal you should submit it? Or maybe you want to find relevant articles to cite in your paper? Or are you an editor, and do you need to find reviewers for a particular paper? Jane can help!


PubTator: a web-based text mining tool for assisting biocuration#

https://www.ncbi.nlm.nih.gov/research/pubtator/


Viziometrics: search for text inside figures#

http://www.viziometrics.org/?keywords=PD-L1&art_view=false

In this project, we use techniques from computer vision and machine learning to classify more than 8 million figures from PubMed into 5 figure types and study the resulting patterns of visual information as they relate to impact. We find that the distribution of figures and figure types in the literature has remained relatively constant over time, but can vary widely across field and topic. We find a significant correlation between scientific impact and the use of visual information, where higher impact papers tend to include more diagrams, and to a lesser extent more plots and photographs.

A project of the eScience Institute at the University of Washington.


Publishing platform#

Peeriodicals: your own curated collections of publications#

https://peeriodicals.com/

A peeriodical is a lightweight virtual journal with you as the Editor-in-chief, giving you complete freedom in setting editorial policy to select the most interesting and useful manuscripts for your readers. The manuscripts you will evaluate and select are existing publications—preprints and papers. Thus, a peeriodical replicates all the functions of a traditional journal, including discovery, selection and certification, except publication itself.


PubPub: a nonprofit alternative to existing publishing models and tools#

https://www.pubpub.org/

PubPub gives research communities of all stripes and sizes a simple, affordable, and nonprofit alternative to existing publishing models and tools.

As part of the Knowledge Futures Group, we’re committed to making PubPub open and easily accessible to a wide range of groups. That means we’re committed to providing a free version of PubPub forever, releasing open-source code, and operating under non-profit, sustainable, researcher-friendly business models.


Citation network#

CoCites: a citation-based method for searching scientific literature#

https://cocites.com/

Google Chrome extension

Co-cited is the co-citation frequency, indicating how many articles cite the article together with the query article. Similarity is the co-citation as percentage of the times cited of the query article or the article in the search results, whichever is the lowest. These numbers are calculated for the last 100 citations when articles are cited more than 100 times.

Co-citation is the frequency with which two articles are cited together in the reference lists of other articles.

Across many reference lists, we find that some articles are co-cited more frequently than others. Articles that are frequently cited together tend to be on a similar topic.

CoCites retrieves articles that cite an article of interest (the ‘query article’) and extracts all titles in their reference lists. CoCites counts how often each title appears in all reference lists and ranks them in descending order.
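The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CoCites' own code: the function names (`cocitation_ranking`, `similarity`) and the toy reference lists are made up, and the sketch ignores the 100-citation cap mentioned earlier.

```python
from collections import Counter


def cocitation_ranking(query, reference_lists):
    """Rank titles by how often they appear alongside `query` in a reference list."""
    counts = Counter()
    for refs in reference_lists:
        unique_refs = set(refs)  # count each title once per citing article
        if query in unique_refs:
            counts.update(unique_refs - {query})
    # Descending co-citation count, then title for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def similarity(cocited, times_cited_query, times_cited_result):
    """Co-citation count as a percentage of the lower of the two citation counts."""
    lowest = min(times_cited_query, times_cited_result)
    return 100.0 * cocited / lowest if lowest else 0.0


# Hypothetical reference lists of four citing articles.
citing_articles = [
    ["A", "B", "C"],
    ["A", "B"],
    ["A", "C", "D"],
    ["B", "D"],
]

ranking = cocitation_ranking("A", citing_articles)
print(ranking)
print(round(similarity(2, 3, 3), 1))
```

In this toy data, articles B and C are each co-cited twice with the query article A, and D once, so B and C rank first.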


Connected Papers: view a network of similar papers#

https://www.connectedpapers.com

Connected Papers is a unique, visual tool to help researchers and applied scientists find and explore papers relevant to their field of work. To create each graph, we analyze an order of ~50,000 papers and select the few dozen with the strongest connections to the origin paper. In the graph, papers are arranged according to their similarity. That means that even papers that do not directly cite each other can be strongly connected and very closely positioned.

Connected Papers is not a citation tree. Our similarity metric is based on the concepts of Co-citation and Bibliographic Coupling. According to this measure, two papers that have highly overlapping citations and references are presumed to have a higher chance of treating a related subject matter.
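To make the two concepts concrete, here is a minimal Python sketch. It is not the Connected Papers metric, whose exact formula is not given here. It simply counts shared references (bibliographic coupling) and shared citing papers (co-citation) and adds them, which is an assumption made for illustration. The paper IDs and function names are hypothetical.

```python
def bibliographic_coupling(references, a, b):
    """Number of references that papers `a` and `b` have in common."""
    return len(references.get(a, set()) & references.get(b, set()))


def cocitation(references, a, b):
    """Number of papers whose reference list contains both `a` and `b`."""
    return sum(1 for cited in references.values() if a in cited and b in cited)


def combined_similarity(references, a, b):
    """Illustrative score: an unweighted sum of the two counts (an assumption)."""
    return bibliographic_coupling(references, a, b) + cocitation(references, a, b)


# Hypothetical citation data: paper -> set of papers it cites.
references = {
    "P1": {"X", "Y", "Z"},
    "P2": {"X", "Y"},
    "P3": {"P1", "P2"},
    "P4": {"P1", "P2", "X"},
}

print(bibliographic_coupling(references, "P1", "P2"))
print(cocitation(references, "P1", "P2"))
print(combined_similarity(references, "P1", "P2"))
```

Here P1 and P2 never cite each other, yet they share two references and are cited together by two papers, so they would still be placed close together.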


SurVis: visual literature browser#

http://dynamicgraphs.fbeck.com/

Source code: https://github.com/fabian-beck/survis

SurVis is a flexible online browser to present and analyze scientific literature. The system is made for authors of survey articles, theses, or books who want to share their references in a user-friendly way. All you need to start is a bib file and a list of keywords for your papers.


Stateoftheart AI: visualize the evolution of models#

https://www.stateoftheart.ai

We present a platform that maps and visualizes the evolution of models, a taxonomy of tasks and datasets, a wiki of concepts, a network of research collaborations, and the citation graph of papers – together with certain interactions between them. There is more to come! Additional features and functionality are already in development.


Not-for-profit#

Crossref: metadata for scholarly works#

https://www.crossref.org

Crossref makes research outputs easy to find, cite, link, assess, and reuse.

We’re a not-for-profit membership organization that exists to make scholarly communications better. We rally the community; tag and share metadata; run an open infrastructure; play with technology; and make tools and services—all to help put scholarly content in context.

We offer a wide array of services to ensure that scholarly research metadata is registered, linked, and distributed. When members register their content with us, we collect both bibliographic and non-bibliographic metadata. We process it so that connections can be made between publications, people, organizations, and other associated outputs. We preserve the metadata we receive for the scholarly record. We also make it available across a range of interfaces and formats so that the community can use it and build tools with it.
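As a rough sketch of what building on this metadata can look like, the Python below queries the public Crossref REST API at `https://api.crossref.org/works` using only the standard library. The helper names are made up for this post, the sample payload (including its DOI and title) is invented to show the response shape, and a real client should follow Crossref's etiquette guidelines, which are not covered here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://api.crossref.org/works"


def works_query_url(query, rows=5):
    """Build a search URL for the Crossref works endpoint."""
    return f"{API}?{urlencode({'query': query, 'rows': rows})}"


def titles_from_response(payload):
    """Extract (DOI, first title) pairs from a parsed works response."""
    items = payload.get("message", {}).get("items", [])
    return [(item.get("DOI"), (item.get("title") or [""])[0]) for item in items]


def search_crossref(query, rows=5):
    """Run the search over the network and return (DOI, title) pairs."""
    with urlopen(works_query_url(query, rows)) as response:
        return titles_from_response(json.load(response))


# Invented payload that mimics the response shape; not real Crossref data.
sample_payload = {
    "message": {
        "items": [
            {"DOI": "10.1234/example", "title": ["An example title"]},
            {"DOI": "10.1234/untitled"},
        ]
    }
}

print(works_query_url("PD-L1 figures", rows=2))
print(titles_from_response(sample_payload))
```

Calling `search_crossref("citation networks")` would perform the actual request; it is left out of the example so the sketch runs offline.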

Crossref also documents Python, Ruby, R, and JavaScript libraries for accessing its metadata.


Internet Archive Scholar#

https://scholar.archive.org/

Search Millions of Research Papers

This fulltext search index includes over 25 million research articles and other scholarly documents preserved in the Internet Archive. The collection spans from digitized copies of eighteenth century journals through the latest Open Access conference proceedings and pre-prints crawled from the World Wide Web.


OpenCitations: citation data for scholarly works#

https://opencitations.net/

OpenCitations is an independent not-for-profit infrastructure organization for open scholarship dedicated to the publication of open bibliographic and citation data by the use of Semantic Web (Linked Data) technologies. It is also engaged in advocacy for open citations, particularly in its role as a key founding member of the Initiative for Open Citations (I4OC). For administrative convenience, OpenCitations is managed by the Research Centre for Open Scholarly Metadata at the University of Bologna.

OpenCitations offers data for download, an API to query the data, and software for working with publication data.


PubMed: biomedical literature from MEDLINE, life science journals, and online books#

https://pubmed.ncbi.nlm.nih.gov/

PubMed® comprises more than 32 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full text content from PubMed Central and publisher web sites.


PubPeer: share comments on publications#

https://pubpeer.com/

The PubPeer Foundation is a California-registered public-benefit corporation with 501(c)(3) nonprofit status in the United States. The overarching goal of the Foundation is to improve the quality of scientific research by enabling innovative approaches for community interaction.

In the PubPeer FAQ, there is guidance for how to write comments:

By far the most important rule for commenting is to base your statements on publicly verifiable information. This will usually be the data published in the paper you are commenting on, but could also be another paper or some other source such as a book, newspaper or website. If possible, please include supporting information in the comment and you must at least cite your sources.


Semantic Scholar: an AI-powered research tool for scientific literature#

https://www.semanticscholar.org/

Our mission is to accelerate scientific breakthroughs by using AI to help scholars locate and understand the right research, make important connections, and overcome information overload.

Semantic Scholar is created by the Allen Institute for AI (AI2).

AI2 is a non-profit research institute founded in 2014 with the mission of conducting high-impact AI research and engineering in service of the common good. AI2 is the creation of Paul Allen, Microsoft co-founder, and is led by Dr. Oren Etzioni, a leading AI researcher.


Sherpa Romeo: find open access policies from each publisher#

https://v2.sherpa.ac.uk/romeo/

Sherpa Romeo is an online resource that aggregates and analyses publisher open access policies from around the world and provides summaries of publisher copyright and open access archiving policies on a journal-by-journal basis.

Jisc is a UK not-for-profit organization that operates and funds Sherpa Romeo to serve the needs of the open access community.


Startup#

Litmaps: interactive citation networks and email updates#

https://app.litmaps.co/

Litmaps combines interactive citation maps, modern search tools, and regular email updates, to create the best research discovery experience ever.


scite: distinguish supporting or contrasting citations#

https://scite.ai/reports/fast-sensitive-and-accurate-integration-jXaGNr5

scite is a Brooklyn-based startup that helps researchers better discover and evaluate scientific articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contradicting evidence.

The most efficient way to discover and understand research. Using Smart Citations, easily check how a scientific article has been cited and if its findings have been supported or contrasted by others. World’s largest source of citation statements. 890m citation statements extracted and analyzed from over 26m full-text articles.


Corporate#

Dimensions: a comprehensive collection of linked grants, publications, datasets and more#

https://app.dimensions.ai/

Together, we have created a database that offers the most comprehensive collection of linked data in a single platform; from grants, publications, datasets and clinical trials to patents and policy documents. Because Dimensions maps the entire research lifecycle, you can follow research from funding through output to impact. It has transformed the way research is discovered, accessed and evaluated.

Dimensions is one of many products by Digital Science & Research Solutions, Inc.


Elsevier Scopus: curated and linked scholarly literature#

https://www.scopus.com/

Scopus uniquely combines a comprehensive, expertly curated abstract and citation database with enriched data and linked scholarly literature across a wide variety of disciplines.

One of many products from Elsevier.


Google Scholar#

https://scholar.google.com/

Profile, Library, Alerts, Metrics


Google BioMed Explorer: submit questions and get answers#

https://sites.research.google/biomedexplorer/

Answers to biomedical questions

Based on PubMed, PubMedCentral, and CORD-19. Courtesy of the U.S. National Library of Medicine.


Lens: search, analyze and manage patent and scholarly data#

https://lens.org

Lens serves global patent and scholarly knowledge as a public good to inform science and technology enabled problem solving. No account required.


Manuscript Matcher: find an appropriate journal for your work#

https://mjl.clarivate.com/manuscript-matcher

Find relevant, reputable journals for potential publication of your research based on an analysis of tens of millions of citation connections in Web of Science Core Collection using Manuscript Matcher.

Submit your title and abstract, and get a list of journal names.


Microsoft Academic#

https://academic.microsoft.com/home

Microsoft Research set out to demonstrate AI-curated knowledge can effectively assist people in making serendipitous discoveries and deriving valuable insights. After seven years of developing the machine reading technology and working with the research community, we have chosen to embrace a community-driven approach within academia and now turn our focus to exploring ways we can extend this technology to even more people and organizations.

This AI research project will be supported until the end of calendar year 2021, upon which time MAS will be retired.

OpenAlex will launch in December 2021, as a drop-in replacement for Microsoft Academic Graph. Learn more in our latest blog post, and join the mailing list to stay up-to-date.


© 2024 Kamil Slowikowski