An important aspect of adaptive immune receptor repertoire analysis is the identification of the V, D, and J genes that contribute to the rearranged sequences. Identifying the specific genes often requires a reference set of known allelic germline variants of the immune receptor genes. Since the early meetings of the Adaptive Immune Receptor Repertoire Community (AIRR-C), members have expressed the need for germline sets that are open to everybody (including industry) and free of the known issues of the available sets, such as inconsistent nomenclatures and the requirement to buy licenses. In February 2024, the Germline Database Working Group (GLDB WG) published the manuscript “AIRR-C IG Reference Sets: curated sets of immunoglobulin heavy and light chain germline genes”, introducing the AIRR-C human immunoglobulin germline gene reference. This reference can be downloaded from OGRDB and is also available for use in IgBLAST. The AIRR-C IG Reference Sets follow the FAIR principles (Findable, Accessible, Interoperable, and Reusable) and are released under a minimally restrictive licence.
In episode 15 of the On AIRR podcast, Dr. William Lees (researcher at the University of London and co-leader of the GLDB WG) and Dr. Corey Watson (associate professor at the University of Louisville) explain in detail the context and expectations of this effort, a major achievement of the group. Watson highlights how much work it took for the community to get on the same page and reach a consensus that could be realized in these new reference sets, all at a time when the field has matured and evolved on multiple levels. “It has taken us a lot to think how to build a system that is able to accommodate all the elements and also be responsive to changes in the future. We have established a foundation that we can use to build on top of”, Watson says. The community also required good traceability and reproducibility. On that point, Lees explains that “the sets are free to use, can be downloaded through a REST API, they also have a DOI number, so every time a set changes, we create a new DOI, so it is a good reference to put in a paper”.
If you want to learn about recent innovations and challenges in the generation, curation, and analysis of antibody/B-cell and T-cell receptor repertoire data, and in particular in the development and use of germline resources, don’t miss the “AIRR-C Challenge Session: Using IG and TR germline resources for advancing immunobiology” at the upcoming hybrid “AIRR Community Meeting VII – Learnings and Perspectives”.