Unified Astronomy Thesaurus (UAT) Integration in ADS Search and Discovery

Cross posted from the ADS Blog.

Alberto Accomazzi (ADS Principal Investigator), Jenny Novacescu (Space Telescope Science Institute), Katie Frey (Center for Astrophysics & UAT Curator), and Pavlos Protopapas (Harvard University)

The ADS Team is working in collaboration with Pavlos Protopapas and Ben Yuen of Harvard University to pilot the integration of the Unified Astronomy Thesaurus (UAT) into ADS search and discovery for new, future, and legacy literature. ADS users will be able to browse results using left side facets in the query results screen, or conduct an initial search using UAT terms.

An example search query in ADS. The example query is: full:"super Earth" property:refereed
Example search results from ADS, showing UAT concepts as a facet query on the left side.
The same example search query in ADS as before, but now it includes filtering on a UAT concept. The example query is: full:"super Earth" property:refereed uat:"high contrast techniques"
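
For readers who script their searches, the two example queries can be sketched as a URL for the ADS API search endpoint. This is a minimal sketch, not an official client: it only builds the request URL and sends nothing, the helper name `build_search_url` is made up for illustration, and it assumes the pilot `uat:` field is accepted by the API in the same way as in the search box.

```python
from urllib.parse import urlencode

# Public ADS API search endpoint; real requests also need a personal API token.
ADS_SEARCH_URL = "https://api.adsabs.harvard.edu/v1/search/query"


def build_search_url(query, fields=("bibcode", "title"), rows=10):
    """Return a GET URL for an ADS search; nothing is sent over the network."""
    params = {"q": query, "fl": ",".join(fields), "rows": rows}
    return f"{ADS_SEARCH_URL}?{urlencode(params)}"


base_query = 'full:"super Earth" property:refereed'
# Narrow the same search with a UAT concept, as in the second example above.
# Assumption: the pilot "uat:" field is available through the API as well.
uat_query = base_query + ' uat:"high contrast techniques"'

url = build_search_url(uat_query)
print(url)
```

An actual request would send this URL with an `Authorization: Bearer <token>` header.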

While it is currently possible to search ADS by keywords provided by publishers, there is no single vocabulary that has been consistently used throughout the indexed literature in ADS. The Astronomical Subject Keywords, in use by leading astronomy journals since the 1970s, have not been updated since 2013 and may not cover the latest topics in the field. These keywords also do not include definitions or relationships between concepts. For this reason, the American Astronomical Society (AAS) journals and the Publications of the Astronomical Society of the Pacific (PASP) elected to adopt the Unified Astronomy Thesaurus as their keyword system of choice in 2019 and 2020, respectively.

The UAT is an open, interoperable, and community-supported project which formalizes astronomical concepts and their inter-relationships into a well-curated, freely available open resource. It reconciled divergent and isolated vocabularies from the fields of astronomy and astrophysics, such as the IAU Thesaurus, the Physics and Astronomy Classification Scheme, the Astronomy Subject Keywords, and others. The UAT’s primary mission is to support semantic enrichment of the literature, thereby enabling greater search and discovery across the astrophysics literature. In addition, the UAT is being used as a taxonomy with which to label other astronomical research products, such as software and datasets.

The ADS Team’s goal is to promote the use of UAT concepts as a standard way to describe and discover records in its astronomy collection. Ben Yuen, working under the supervision of Protopapas, is using machine learning techniques to automatically assign UAT terms to the majority of records in ADS which do not have them, such as the legacy literature. In order to produce accurate results, the system is being trained on the corpus of AAS articles which currently have UAT concepts associated with them. Validation of the results through editorial input and user feedback will be used to improve the automated process. 
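
The validation step can be pictured with a small scoring sketch. It is not the pilot's actual method: the post does not say which model or metrics are used, so the function name and the sample terms below are hypothetical. It simply compares the UAT terms assigned automatically to one record against the terms an editor approved.

```python
def precision_recall(predicted, approved):
    """Score automatically assigned terms against editor-approved terms."""
    predicted, approved = set(predicted), set(approved)
    hits = len(predicted & approved)
    # Precision: share of assigned terms that the editor agreed with.
    precision = hits / len(predicted) if predicted else 0.0
    # Recall: share of editor-approved terms that the system found.
    recall = hits / len(approved) if approved else 0.0
    return precision, recall


# Hypothetical record: two terms assigned by the system, three chosen by an editor.
assigned = ["Exoplanets", "Spiral galaxies"]
editor_terms = ["Exoplanets", "Super Earths", "High contrast techniques"]
print(precision_recall(assigned, editor_terms))
```

Averaging such scores over many records would give one rough signal of where the automated assignments need more work.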

This pilot program is beneficial to all of ADS, as it provides a single, up-to-date set of concepts that can be used to identify all current research topics of interest to astronomers. The team intends to extend the system to use concepts drawn from other controlled vocabularies for subject areas outside the core astronomy collection. (To learn more about ADS’ recent expansion, which encompasses Planetary Science and Heliophysics literature and will in future include Earth Science and Biological & Physical Sciences, view our earlier blog post: https://ui.adsabs.harvard.edu/blog/arc-ssad-project.) Development and testing of this prototype are ongoing. The ADS Team expects to release this capability in production by December 2023.

As ADS adopts use of the UAT across its astronomy collection, we encourage all astronomy and astrophysics publishers to use the thesaurus as their article keyword system to facilitate integration of this content into future ADS search and discovery. As more journals and research products – such as datasets, software, and proposals – are tagged with UAT terms, the ability to search, browse, and crosslink all of these resources by science topic will increase.

While the AAS has assumed formal ownership of the UAT, the thesaurus remains available under a Creative Commons License, ensuring its widest use while protecting the intellectual property of its contributors. Development and maintenance are stewarded by a broad group of parties with a direct stake in the UAT; this includes professional associations (IVOA, IAU), learned societies (AAS, RAS), publishers (IOP, AIP), software developers, librarians and other curators working for major astronomy institutes and data archives.

The UAT has been implemented by an increasing number of journals, research organizations, and systems. Current adopters include:

  • American Astronomical Society journals, including The Astronomical Journal (AJ), The Astrophysical Journal (ApJ), ApJS, ApJL, The Planetary Science Journal (PSJ), and Research Notes of the AAS (RNAAS)
  • Astrophysics Data System (ADS)
  • Publications of the Astronomical Society of the Pacific (PASP)
  • International Virtual Observatory Alliance (IVOA)
  • Space Telescope Science Institute (STScI) for the Hubble Space Telescope (HST) and James Webb Space Telescope (JWST) proposal systems
  • WikiData
  • Icarus – in formulation for 2023

Update from the Unified Astronomy Thesaurus

This was originally posted on AAS Nova, written by Susanna Kohler, 12 February 2021.

Remember the Unified Astronomy Thesaurus? The UAT is an open, interoperable, and community-supported project that formalizes astronomical concepts and their inter-relationships into a high quality, freely available open resource. This resource can then be used to tag astronomical work — like articles, proposals, and datasets — with accurate, broadly adopted concepts.

The UAT has taken off in the year since we last reported on it! AAS journals have all moved entirely to using the more flexible and dynamic UAT in place of the old, static keyword system. In addition, adoption is increasing across the broader astronomical community: the UAT has been implemented by the AAS journals, the Publications of the Astronomical Society of the Pacific, the International Virtual Observatory Alliance, the proposal system for the Hubble Space Telescope, and WikiData.

This week brings two news items from the UAT:

  1. An opportunity to join the UAT Steering Committee
  2. An update on the newest release of the UAT.

A Quick Refresher

Why is the UAT so cool? Simply put, organizing information is hard, but the UAT has provided a much-needed modern update for astronomy. Old systems of static keywords fail to capture the multidimensional nature of how concepts can relate to each other. When using the UAT to select keywords for their work, authors now have access to a much broader range of suggestions that allow them to more accurately reflect what their work is about.

An example: suppose I’m writing an article on spiral galaxies. If I enter this concept into the UAT, the Thesaurus knows that spiral galaxies fall under the parent concept of disk galaxies, and it also knows that Andromeda is a specific example of a spiral galaxy. What’s more, it’s aware that spiral galaxies are also referred to as S galaxies, and that the topic might come up in the related concept of the Hubble galaxy classification scheme.

Screenshot shows result of entering "spiral galaxies" into the UAT.
The UAT entry for the concept “spiral galaxies” includes broader and narrower concepts, alternate terms, related concepts, and a definition.
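
To make the shape of such an entry concrete, here is a minimal sketch of a concept record with the four kinds of links described above. The class, its field names, and the helper are illustrative assumptions, and the labels are taken from the paragraph above, so they may not match the exact preferred labels in the UAT.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus entry with SKOS-style relationships (illustrative only)."""
    label: str
    alt_labels: list = field(default_factory=list)  # other names for the concept
    broader: list = field(default_factory=list)     # parent concepts
    narrower: list = field(default_factory=list)    # more specific concepts
    related: list = field(default_factory=list)     # associated concepts


spiral = Concept(
    label="spiral galaxies",
    alt_labels=["S galaxies"],
    broader=["disk galaxies"],
    narrower=["Andromeda"],
    related=["Hubble galaxy classification scheme"],
)


def suggest_keywords(concept):
    """Collect the concept and its neighbours as candidate keywords."""
    return [concept.label, *concept.broader, *concept.narrower, *concept.related]


print(suggest_keywords(spiral))
```

Alternate labels are left out of the suggestions on purpose: they name the same concept, so they help with lookup but add nothing as extra keywords.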

The relationships charted within the UAT make it much easier for me to select the concepts that best describe the article I’m writing, the UAT’s living and adaptable nature allows it to keep up with changing times, and universal adoption of the UAT will greatly simplify the organization of information across platforms.

Become a UAT Steering Committee Member!

Are you convinced that this is a cool concept? Want to help shape the future development of the UAT? The UAT Steering Committee is seeking a new member.

The Steering Committee (SC) sets the general parameters for the overall direction of the UAT and is composed of representatives from groups with a direct stake in the development and success of the Thesaurus. Members of the SC also serve as representatives of the UAT, promoting it to global astronomy and astrophysics, library, and publishing communities; developing test cases; and increasing its overall use.

The UAT currently welcomes expressions of interest in the open SC position from astronomers, researchers, librarians, and others. The commitment is a two-year term and includes monthly SC meetings. You can contact the chair of the UAT SC, Barbara Kern, with “UAT Steering Committee” in the subject line to express your interest or to ask any questions. [Note: This year’s call for interest closed February 2021 –Katie Frey]

What’s New in the Latest UAT Update

In December 2020, Version 4.0.0 of the UAT was released to the worldwide astronomical community.

UAT v.4.0.0 added nearly 50 new concepts in the areas of planetary science and exoplanets and also added definitions — largely sourced from the Etymological Dictionary of Astronomy and Astrophysics — for about 40% of all existing concepts for the first time. For examples of concept definitions, see the image above for the spiral galaxies concept or view the file for baryonic dark matter. More accompanying definitions are expected in future releases.

A number of technical updates were also implemented in v.4.0.0. Deprecated concepts can now be found in the UAT GitHub repository, and there are multiple json files to choose from if implementing the Unified Astronomy Thesaurus in your local systems. For comprehensive v.4.0.0 release notes, visit https://astrothesaurus.org/blog/.

Where to Learn More

Flowchart-style diagram shows relationship between terms when "exoplanet" is entered into the UAT sorting tool.
Example of the visualization possible using the UAT sorting tool.

Version 4.0.0 of the Unified Astronomy Thesaurus

Today, we release version 4.0.0 of the UAT!

Updates and Changes

In addition to the usual updates and additions to concepts found in the Unified Astronomy Thesaurus, this update also brings with it some minor technical changes and updates that will hopefully help developers who are interested in implementing the UAT into their tools and platforms.

The largest content change this time is the addition of over 850 definitions to UAT concepts.  A few people have been asking for these for a while, and this was finally the year to make some good headway on this.  Almost all of these initial definitions were sourced from the Etymological Dictionary of Astronomy and Astrophysics, with a few being supplemented from other sources.  Definitions for the remaining two-thirds of UAT concepts will be coming in future releases as they can be sourced and vetted.

About 50 concepts were added, while about 25 were deprecated, bringing us to a total of 2122 concepts.  The additional concepts are mostly concentrated in the “Planetary science” branch, with some spill over into exoplanets.  Alternate labels, scope notes, and examples were also added to over 100 concepts, all of which improve usage of the UAT in automated systems.

Which leads me to those technical updates I mentioned earlier.  Deprecated concepts can now be found in the UAT_deprecatedConcepts.rdf file.  More usefully, I’ve gone back through that list of deprecated concepts and added “Use instead” notes for every single one, pointing back to one or more concepts that could be used in lieu of the original concept.  These notes are found as “changeNotes” in UAT_skosnotes.rdf.
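
As a rough illustration of how the “Use instead” notes might be applied, the sketch below swaps deprecated keywords for their suggested replacements. The mapping and concept names are placeholders, not real UAT data, and reading the notes out of the RDF files is left out entirely.

```python
# Hypothetical data: deprecated concept label -> its "Use instead" targets.
USE_INSTEAD = {
    "Old concept A": ["Current concept X"],
    "Old concept B": ["Current concept Y", "Current concept Z"],
}


def modernize(keywords, redirects):
    """Replace deprecated keywords, keeping order and dropping duplicates."""
    result = []
    for keyword in keywords:
        # Keywords that are not deprecated pass through unchanged.
        for replacement in redirects.get(keyword, [keyword]):
            if replacement not in result:
                result.append(replacement)
    return result


print(modernize(["Old concept B", "Current concept Y", "Other"], USE_INSTEAD))
```

Because one deprecated concept can point to several replacements, a record's keyword list may grow after the update.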

As many developers (including myself) prefer working with json, I’ve also expanded the files available in that format.

  1. UAT.json should be compatible with systems that had been using the prior version of this file.  It contains the full UAT organized into a hierarchy, but now it contains a lot more information about each concept, including definitions, other notes, and related links.  It also has a section to list all the deprecated concepts and their suggested redirects.
  2. UAT_simple.json is an updated version of the older UAT.json file.  It only includes the concepts and their URIs in a hierarchy.  I don’t expect this file to be very useful, but it’s here if anyone needs to work with a slimmed down file.
  3. UAT_list.json would be great if you need to look up information for a specific concept and don’t want to navigate through the whole hierarchy to find it.  Similar to UAT.json, this file contains all information available about each concept, but nothing is nested, and the deprecated concepts are listed right along with the active concepts.
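
The difference between the nested and the flat layouts can be sketched with a tiny made-up hierarchy. The key names (`children`, `name`, `uri`) and the example.org URIs are assumptions for illustration; check the real files for their actual structure before relying on this.

```python
import json

# Hypothetical miniature hierarchy; the real UAT.json may use other key names.
SAMPLE = json.loads("""
{
  "children": [
    {"name": "Galaxies", "uri": "http://example.org/uat/1",
     "children": [
       {"name": "Disk galaxies", "uri": "http://example.org/uat/2",
        "children": [
          {"name": "Spiral galaxies", "uri": "http://example.org/uat/3"}
        ]}
     ]}
  ]
}
""")


def flatten(node, path=()):
    """Yield (uri, name, ancestor names) for every concept below node."""
    for child in node.get("children", []):
        yield child["uri"], child["name"], path
        yield from flatten(child, path + (child["name"],))


# A flat lookup table, similar in spirit to a list-style file.
lookup = {
    uri: {"name": name, "ancestors": list(path)}
    for uri, name, path in flatten(SAMPLE)
}
print(lookup["http://example.org/uat/3"])
```

If a concept sits under more than one parent, this dictionary keeps only the last path seen, which is one reason a ready-made flat file is convenient.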

Presentations and Events

In addition to our usual presence at the AAS Annual meeting, the UAT was visible at a few other events this year.

Stewardship and Impact of a Thesaurus for the Astronomy Community

I gave a poster presentation at the Special Libraries Association conference over the summer.  The poster and presentation recording along with additional information from that event can be found here.

Powered by the Unified Astronomy Thesaurus

Last year, Frank Timmes recorded four short videos introducing the UAT to the astronomy community.  This summer I followed up with him and we recorded three longer format videos discussing how to use the UAT, how authors can influence the UAT, and how the UAT has been used in publications so far.  All seven videos in this series can be found in this playlist on YouTube.

Unified Astronomy Thesaurus Informational Webinar

Speakers from IVOA, STScI, AAS, and ADS presented on their implementations, current and planned, of the Unified Astronomy Thesaurus.  Slides are available here, and a video recording of the session should be added shortly.

Concluding Remarks

The Steering Committee wishes to thank those from the astronomy community who took the time to contribute feedback for improving the UAT.  We also wish to thank the American Astronomical Society for continuing to support the growth of the Unified Astronomy Thesaurus, especially the editors who provided feedback for proposed changes.

Best,
Katie Frey
UAT Curator

Updates to Versioning and Release Cycle Documents

While preparing the UAT v.3.1.0 release, it became apparent that the existing versioning and release documentation did not fit the workflow for the UAT. Reflecting on the UAT releases in the three years since this guidance was originally written, we’ve produced updated documents that align better with the versioning and release cycle of the UAT.

The UAT will continue to use version numbers inspired by Semantic Versioning, but features such as tracking backwards compatibility and functionality that are of core importance to software packages do not apply to data products such as the UAT. As such, the guidelines for deciding what constitutes a major, minor, or patch release have been rewritten to better reflect the actual process of releasing new versions of the UAT.
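
For tools that need to react to new releases, the three-part version numbers can be compared mechanically. The sketch below only compares the numbers; what actually qualifies a UAT release as major, minor, or patch is set out in the Versioning document and is not encoded here. Both function names are hypothetical.

```python
def parse_version(text):
    """Turn 'v.4.0.0', 'v4.0.0' or '4.0.0' into a (major, minor, patch) tuple."""
    digits = text.lower().lstrip("v").lstrip(".")
    major, minor, patch = (int(part) for part in digits.split("."))
    return major, minor, patch


def release_type(old, new):
    """Name the highest-order version component that changed."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {old}")
    for name, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if before != after:
            return name


print(release_type("v.3.1.0", "v.4.0.0"))
print(release_type("v1.0.0", "v1.1.0"))
```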

Likewise, the Release Cycle has spun off into its own document, and has been greatly expanded to include a schedule based on the real work of updating the UAT over the last few years. The hope is that this Release Cycle document can help inform authors and users of the UAT about the process of updating the UAT, while the Curator will be able to use it to help guide how suggestions are evaluated and to alert the community of upcoming changes in a timely manner.

We expect these Versioning and Release Cycle documents will be revisited and revised in the future as needed.

Status of the UAT Project

Although the website has been pretty quiet, a LOT of work has gone into the UAT since I last posted an update nearly one year ago.  This is a short summary update of the status of the UAT project; expect more details to follow.

1) UAT v1.1.0 Published Online

A few weeks ago we published version 1.1.0 to Research Vocabularies Australia (RVA), a controlled vocabulary discovery service from the Australian National Data Service (ANDS).  We’ve been collaborating with ANDS for the better part of 2016 and are happy to have the UAT publicly available on their platform.

From the RVA platform, you can download the full UAT in different file formats, or use the API function to connect the UAT to your applications and websites.

 

2) UAT Steering Committee

Julie Steffen, Director of Publishing for the American Astronomical Society, has formed a Steering Committee to manage the operation and direction of the UAT.  The Committee meets regularly, about once a month, to discuss topics such as outreach, funding, development, and licensing.

More information about the Committee can be found on the “Governance” page under “About Us.”

 

3) Versioning, Patch Notes, Deltas

Alberto Accomazzi (SAO/NASA ADS) has been developing a versioning scheme based on the Semantic Versioning standards for the Unified Astronomy Thesaurus.  The existing versions found on GitHub have been renamed to follow the new scheme, and this versioning system will be used moving forward.

A defined versioning system will allow us to post useful patch notes, to describe the kinds of changes that have been made from one update to the next.  Once finalized, the versioning documentation will be made available.

Additionally, we are examining the process of providing deltas (a file containing only the changes from one version to the next) as part of the update cycle.
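
One way to picture a delta is as the difference between two snapshots of the thesaurus. The sketch below is an assumption about what such a file could contain, not a description of the format under consideration; the URIs and labels are placeholders.

```python
def compute_delta(old, new):
    """Compare two {uri: label} snapshots and return only what changed."""
    return {
        "added": {uri: new[uri] for uri in new.keys() - old.keys()},
        "removed": {uri: old[uri] for uri in old.keys() - new.keys()},
        "relabeled": {
            uri: (old[uri], new[uri])
            for uri in old.keys() & new.keys()
            if old[uri] != new[uri]
        },
    }


# Placeholder snapshots of two consecutive versions.
previous = {"uat/1": "Galaxies", "uat/2": "Disc galaxies", "uat/3": "Old concept"}
current = {"uat/1": "Galaxies", "uat/2": "Disk galaxies", "uat/4": "New concept"}

print(compute_delta(previous, current))
```

A system that already holds the previous version could apply just these three groups of changes instead of reloading the whole thesaurus.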

 

4) Contribution Tracking & GitHub Issues

A persistent issue for updating and managing the UAT over the years has been keeping track of suggestions, contributions, and the decisions regarding them. Our first temporary solution consisted of emails sent directly to me that I filed away into a folder until I was able to act on them. Unfortunately, this system was equivalent to a black hole. Information goes in, but it’s hard to tell what, if anything, is coming back out.

At a recent Steering Committee meeting, the Issues feature on GitHub was suggested as a way to manage and track the various suggestions. GitHub also has the added benefit of being an open system; anyone can see the current suggestions under discussion and create an account to make a contribution.

A few weeks ago I began transferring comments to the UAT Issues tracker, and I would welcome anyone with a suggestion or idea for the UAT to add it to the list.

More documentation detailing how we will be using the Issue tracker will be forthcoming.

 

5) Sorting Tool

Over the last year, the Sorting Tool was developed by Sarah Weissman (STScI) and myself as a way to give our users a visual overview of the UAT and make suggestions directly in the hierarchy.

Although this is a very powerful tool, currently the system submits its feedback as an email directly to me, which I plan to duplicate as an Issue on GitHub.  Pushing feedback from the Sorting Tool directly into the Issue tracker is being examined.
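
As a sketch of what pushing feedback into the tracker could involve, the snippet below turns one piece of Sorting Tool feedback into the JSON body that GitHub's "create an issue" REST endpoint (`POST /repos/{owner}/{repo}/issues`) accepts. The feedback fields, the function name, and the `sorting-tool` label are assumptions; nothing is sent, and the Sorting Tool's real feedback format is not described here.

```python
import json


def feedback_to_issue(sender, concept, suggestion):
    """Build a JSON-ready body for GitHub's 'create an issue' endpoint."""
    return {
        "title": f"Sorting Tool feedback: {concept}",
        "body": f"Submitted by: {sender}\n\n{suggestion}",
        "labels": ["sorting-tool"],  # hypothetical label name
    }


payload = feedback_to_issue(
    sender="A. User",
    concept="Spiral galaxies",
    suggestion="Consider moving this concept under a different parent.",
)
print(json.dumps(payload, indent=2))
```

Posting the payload would additionally require an authenticated request against the repository that hosts the issue tracker.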

 

6) Website Updates

Updates have been made across the UAT website, focused mainly on cleaning up the existing content and tidying up the navigational menus.

With the addition of the ANDS vocabulary server, I’ve removed the old hierarchical and alphabetical browsers, the UAT Explorer, and the UAT dendrogram view.  These tools were difficult to maintain, requiring manual creation of files and uploads to the website.  The new vocabulary server maintained by ANDS replaces most of those functionalities.

The Governance page has been updated to reflect the Steering Committee, and the Contribute pages now direct users to GitHub and the Sorting Tool.

Update on the UAT

I just realized it had been a while since my last update!  I’ve been working on the UAT behind the scenes quite a bit lately, though most of it has not been visible so I thought it would be a good time to write another update email.

We’re nearly ready to launch VocBench: After looking at various tools, we landed on VocBench, an open source platform for managing and editing controlled vocabularies.  We’ve spent the last few months getting it ready to go, and we are almost ready to make it public.
This platform will allow users to suggest edits and updates to the UAT, which are then assigned to our team of editors for review, and suggestions that are accepted will be incorporated into a future release.

Recent updates to the UAT website: I spent some time re-organizing the website to make visible some items that have previously been buried.  For example, now you will see the Contribute button right in the menu bar!  For now this button takes you to our contribution form, but once VocBench is launched, it will go there instead.
Also, the download section is no longer hidden under the Thesaurus button.  From there you can download the current RDF file as well as a new flat CSV file.  I’ve also included a link to the UAT GitHub repository, where I’ve been hard at work creating scripts to turn the RDF/SKOS file into the website browsers.
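
For a sense of what turning the RDF/SKOS file into a flat table involves, here is a minimal standard-library sketch. It is not the script from the repository: the sample RDF, the example.org URIs, and the column layout are made up, and it only handles concepts written as `skos:Concept` elements. A dedicated RDF library would be the more robust choice for the real file.

```python
import csv
import io
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]

# Made-up two-concept sample in RDF/XML.
SAMPLE_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/uat/2">
    <skos:prefLabel xml:lang="en">Disk galaxies</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/uat/3">
    <skos:prefLabel xml:lang="en">Spiral galaxies</skos:prefLabel>
    <skos:altLabel xml:lang="en">S galaxies</skos:altLabel>
  </skos:Concept>
</rdf:RDF>
"""


def skos_to_csv(rdf_xml):
    """Write one CSV row per skos:Concept: URI, preferred label, alternate labels."""
    root = ET.fromstring(rdf_xml)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["uri", "prefLabel", "altLabels"])
    for concept in root.findall("skos:Concept", NS):
        pref = concept.findtext("skos:prefLabel", default="", namespaces=NS)
        alts = [element.text for element in concept.findall("skos:altLabel", NS)]
        # Several alternate labels share one cell, separated by "|".
        writer.writerow([concept.get(RDF_ABOUT), pref, "|".join(alts)])
    return out.getvalue()


print(skos_to_csv(SAMPLE_RDF))
```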

Posters and papers: Alberto Accomazzi et al. wrote a paper about the UAT project, following a poster he presented at the ADASS XXIII conference.  Currently the paper can be found in arXiv, but it will also be published in an upcoming volume of the ASP Conference Proceedings.
Another poster on the Unified Astronomy Thesaurus was presented in June at the Libraries and Information Services in Astronomy conference, with an upcoming paper also scheduled to be published in a future volume of the ASP Conference Proceedings.

We’re still looking for volunteers to help oversee various branches of the UAT!  If you’re interested in becoming an editor, please let me know.

New Thesaurus Created for the Astronomy Community

The following press release was originally posted by IOP Publishing.

Joint news announcement from: The SAO/NASA Astrophysics Data System, the American Astronomical Society, the American Institute of Physics, the Harvard-Smithsonian Center for Astrophysics John G. Wolbach library, IOP Publishing, and the International Virtual Observatory Alliance.

New thesaurus created for the astronomy community
24 Jan 2013
Bristol, UK

The American Institute of Physics (AIP) and IOP Publishing (IOP) have jointly announced the gift of a new astronomy thesaurus called the Unified Astronomy Thesaurus (UAT) to the American Astronomical Society (AAS) that will help improve future information discovery for researchers.

The AAS will make the UAT freely available for development and use within the astronomy community, while ensuring the thesaurus remains relevant and useful. Further development of the UAT will be undertaken by the John G. Wolbach Library at the Harvard-Smithsonian Center for Astrophysics in collaboration with the Astrophysics Data System (ADS) and the International Virtual Observatory Alliance (IVOA) to enhance and extend the thesaurus to ensure that it continues to meet the needs of the astronomy community.
