Unified URI-safe Scripture Reference

Unified URI-safe Scripture Reference 0.1 #

This standard is in WORKING DRAFT status. Any references to it should note that this is a “WORKING DRAFT and may be subject to change”.

Terminology #

Herein, for brevity, the “Unified URI-safe Scripture Reference” is referred to as “UUSR” or, where context allows, simply “reference”. The plural forms are “UUSRs” and “references”.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Abstract #

Biblical standardisation can be a heated and controversial topic, not least with the differing beliefs on which texts should be treated as canonical. This technical standard aims to sidestep the controversy by not attempting to tread new ground but instead building upon the existing unification efforts of USFM and USX. The aim is to extend their excellent work and allow biblical references to be expressed in a character set directly compatible with URIs [RFC3986] without the need for encoding special characters.

While by no means universal, there are already numerous software applications and websites which support the UUSR syntax described herein. To our knowledge, there is currently no documented specification describing this syntax.

In the spirit of interoperability, this specification seeks to formalise the syntax to provide a central, authoritative source and point-of-reference for all technical implementations to agree upon and adhere to. The syntax is largely unaltered from the “Unified Standard Format Markers” (USFM) [Appendix 1] and “Unified Scripture XML” 3.0 (USX) [Appendix 2] scripture reference syntax, except that the special characters are replaced with characters that don’t require URI encoding [RFC3986]. It has been extended to allow references to specify a bible translation, and additional clarification has been given in areas where the USFM and USX guidance was open to interpretation.

Syntax #

Valid references MUST contain ONLY the following characters: uppercase letters (A-Z), digits (0-9), the period (.) and the hyphen (-). Whitespace MUST NOT be present anywhere within the reference.

Consequently, every UUSR MUST match the following regular expression [RFC9485], although matching it does not by itself guarantee valid syntax: ^[A-Z0-9.-]+$.

Implementations parsing UUSRs MAY either accept or reject references with invalid syntax, but MUST accept every reference with valid syntax.
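As an illustration of the character rule above, the following minimal Python sketch applies the regular expression as a pre-check and confirms that a reference passes through percent-encoding unchanged. The function names has_valid_charset and needs_percent_encoding are hypothetical and not part of this specification. Passing the pre-check does not make a reference valid; it only rules out strings that cannot be valid.

```python
import re
from urllib.parse import quote

# Character-level pre-check, taken from the regular expression above.
# fullmatch() is used instead of ^...$ so a trailing newline is not accepted.
UUSR_CHARSET = re.compile(r"[A-Z0-9.-]+")


def has_valid_charset(reference: str) -> bool:
    """Return True if the string uses only characters permitted in a UUSR."""
    return UUSR_CHARSET.fullmatch(reference) is not None


def needs_percent_encoding(text: str) -> bool:
    """Return True if percent-encoding would alter the text."""
    return quote(text, safe="") != text


print(has_valid_charset("MAT.2.10"))         # True
print(has_valid_charset("MAT 2:10"))         # False: space and colon
print(needs_percent_encoding("MAT.2.1-12"))  # False
print(needs_percent_encoding("MAT 3:1-4"))   # True
```

Note that a string such as MAT..2 passes this character check even though it is not a valid reference, which is why a structural parse is still needed.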

A UUSR can refer to a single book, chapter or verse, or to a range that lies within one of these or spans several of them.

Singular references #

Singular references follow the structure: [BOOK_IDENTIFIER].[CHAPTER_NUMBER].[VERSE_NUMBER], or [BOOK_IDENTIFIER].[CHAPTER_NUMBER], or [BOOK_IDENTIFIER].

The BOOK_IDENTIFIER is REQUIRED and MUST be in UPPERCASE, exactly three characters long and be listed as the USFM identifier for the particular book [Appendix 3].

The CHAPTER_NUMBER is OPTIONAL. When present it MUST be an integer above zero and no greater than the number of chapters in the book. If it is omitted, the period character that would precede it MUST also be omitted.

The VERSE_NUMBER is OPTIONAL, but MUST NOT be present if CHAPTER_NUMBER is omitted. When present it MUST be an integer above zero and no greater than the number of verses in the chapter. If VERSE_NUMBER is omitted, the period character that would precede it MUST also be omitted.

Examples of valid syntax: #

  • MAT
  • MAT.2
  • MAT.2.10
  • 1CO
  • 1CO.13
  • 1CO.13.4

Examples of invalid syntax: #

  • MAT.
  • MAT.2.
  • MAT..2
  • mat.2.10
  • Mat.2.10
  • Matthew.2.10
  • MAT 2.10
  • MAT 2:10
  • 1 CO
  • 1 CO.13
  • 1 CO.13.4
  • 1 COR.13.4
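The singular reference rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: KNOWN_BOOKS is a hypothetical five-entry subset standing in for the full identifier list of Appendix 3, leading zeros in numbers are rejected (the specification does not address them), and the upper bounds on chapter and verse numbers are not checked because that requires versification data.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical subset of the USFM book identifiers, for illustration only.
# A real implementation would load the full list referenced in Appendix 3.
KNOWN_BOOKS = {"MAT", "MRK", "LUK", "JHN", "1CO"}

# Book, then an optional chapter, then an optional verse. The nesting means a
# verse can only appear when a chapter is present. [1-9][0-9]* rejects zero
# and (as an assumption) leading zeros.
SINGULAR = re.compile(r"([A-Z0-9]{3})(?:\.([1-9][0-9]*)(?:\.([1-9][0-9]*))?)?")


class Singular(NamedTuple):
    book: str
    chapter: Optional[int]
    verse: Optional[int]


def parse_singular(reference: str) -> Optional[Singular]:
    """Parse a singular reference, returning None if the syntax is invalid."""
    match = SINGULAR.fullmatch(reference)
    if match is None or match.group(1) not in KNOWN_BOOKS:
        return None
    book, chapter, verse = match.groups()
    return Singular(
        book,
        int(chapter) if chapter else None,
        int(verse) if verse else None,
    )


print(parse_singular("1CO.13.4"))  # Singular(book='1CO', chapter=13, verse=4)
print(parse_singular("MAT..2"))    # None
```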

Range references #

Range references are made up of two singular references joined by a hyphen - character, and follow one of the structures:

  • [BOOK_IDENTIFIER]-[BOOK_IDENTIFIER]
  • [BOOK_IDENTIFIER].[CHAPTER_NUMBER]-[CHAPTER_NUMBER]
  • [BOOK_IDENTIFIER].[CHAPTER_NUMBER].[VERSE_NUMBER]-[VERSE_NUMBER]
  • [BOOK_IDENTIFIER].[CHAPTER_NUMBER].[VERSE_NUMBER]-[CHAPTER_NUMBER].[VERSE_NUMBER]
  • [BOOK_IDENTIFIER].[CHAPTER_NUMBER].[VERSE_NUMBER]-[BOOK_IDENTIFIER].[CHAPTER_NUMBER].[VERSE_NUMBER]

If a range is within the same chapter, the chapter number MUST appear only once.

If a range is within the same book, the book identifier MUST appear only once.

If a range spans multiple books and is in whole chapters, the verse SHOULD be omitted unless it serves to reduce ambiguity, such as at the end of John 7.

If a range spans multiple books and is in whole books, both the verse and the chapter SHOULD be omitted unless they serve to reduce ambiguity.

Examples of valid syntax: #

  • MAT.2.1-12 (which should be interpreted as verses 1-12 in chapter 2)
  • MAT.1.18-2.12 (which should be interpreted as from chapter 1 verse 18 to chapter 2 verse 12)
  • MAT.3-4 (which should be interpreted as the start of chapter 3 to the end of chapter 4)
  • MAT-JHN
  • 1CO.13.4-13

Examples of discouraged syntax: #

  • MAT.3.1-4.25 (this is two whole chapters, so it SHOULD be written as MAT.3-4)
  • MAT.1.1-JHN.21.25 (this is a range of whole books, so it SHOULD be written as MAT-JHN)

Examples of invalid syntax: #

  • MAT.2.1-2.12
  • MAT.3.1-MAT.4.25
  • JHN.3.16-JHN.3.17
  • 1 CO.13.4-13
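One way to read the range structures above is that the end of the range inherits whatever leading components it omits from the start. The Python sketch below follows that reading. It rests on several assumptions that are not stated in this specification: a cross-book end is required to have the same depth as the start, the order of start and end is not checked, and book identifiers are checked for shape only, not against the list in Appendix 3. The name parse_range is hypothetical.

```python
import re
from typing import Optional, Tuple

Position = Tuple[str, Optional[int], Optional[int]]  # (book, chapter, verse)

BOOK = re.compile(r"[A-Z0-9]{3}")
NUMBER = re.compile(r"[1-9][0-9]*")


def _parse_full(text: str) -> Optional[Position]:
    """Parse BOOK, BOOK.C or BOOK.C.V (shape only, no book-list lookup)."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 3:
        return None
    if not BOOK.fullmatch(parts[0]) or parts[0].isdigit():
        return None
    if not all(NUMBER.fullmatch(p) for p in parts[1:]):
        return None
    numbers = [int(p) for p in parts[1:]] + [None, None]
    return (parts[0], numbers[0], numbers[1])


def _depth(position: Position) -> int:
    """1 for a book, 2 for a chapter, 3 for a verse."""
    return sum(part is not None for part in position)


def parse_range(reference: str) -> Optional[Tuple[Position, Position]]:
    """Return (start, end) with the end fully resolved, or None if invalid."""
    start_text, hyphen, end_text = reference.partition("-")
    start = _parse_full(start_text)
    if not hyphen or start is None:
        return None
    book, chapter, verse = start
    end_parts = end_text.split(".")
    if NUMBER.fullmatch(end_parts[0]):
        # The end omits the book, so it inherits the start's book.
        if not all(NUMBER.fullmatch(p) for p in end_parts):
            return None
        nums = [int(p) for p in end_parts]
        if _depth(start) == 2 and len(nums) == 1:
            return (start, (book, nums[0], None))        # BOOK.C-C
        if _depth(start) == 3 and len(nums) == 1:
            return (start, (book, chapter, nums[0]))     # BOOK.C.V-V
        if _depth(start) == 3 and len(nums) == 2 and nums[0] != chapter:
            return (start, (book, nums[0], nums[1]))     # BOOK.C.V-C.V
        return None  # e.g. MAT.2.1-2.12 repeats the chapter
    end = _parse_full(end_text)
    # Repeating the same book is invalid; equal depth is an assumption.
    if end is None or end[0] == book or _depth(end) != _depth(start):
        return None
    return (start, end)


print(parse_range("MAT.1.18-2.12"))  # (('MAT', 1, 18), ('MAT', 2, 12))
print(parse_range("MAT.2.1-2.12"))   # None
```

Discouraged forms such as MAT.3.1-4.25 are still accepted by this sketch, since they are valid syntax.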

Translation #

There are many translations of the bible. Sometimes it is useful to specify a translation in a reference, using the structure: [SINGULAR_REFERENCE].[TRANSLATION_ABBREVIATION] or [RANGE_REFERENCE].[TRANSLATION_ABBREVIATION].

A bible translation in a UUSR is OPTIONAL and MAY be appended to either a singular reference or a range reference.

The TRANSLATION_ABBREVIATION MUST be the commonly accepted abbreviation for the translation and be in UPPERCASE.

The TRANSLATION_ABBREVIATION MUST contain only the characters A-Z and 0-9. It MUST NOT include a hyphen character, a period character, or any other character.

Since a reference can only be in one bible translation, the TRANSLATION_ABBREVIATION MUST appear only once in a reference.

The TRANSLATION_ABBREVIATION MUST be appended only at the end of the reference, regardless of whether it is a singular or range reference.

The TRANSLATION_ABBREVIATION MUST be separated from the reference by a single period character.

Implementations parsing UUSRs MUST use the translation named by the given TRANSLATION_ABBREVIATION if it is available in their software.

Implementations parsing UUSRs SHOULD ignore the TRANSLATION_ABBREVIATION if that translation is unavailable in their software, and SHOULD select the next-closest translation. If this happens, the user SHOULD be informed.
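The selection behaviour described above might be sketched as follows in Python. This specification does not define how the “next-closest” translation is determined, so this sketch assumes the caller supplies an ordered list of fallbacks. The function name choose_translation and its parameters are hypothetical.

```python
from typing import Sequence, Tuple


def choose_translation(
    requested: str,
    available: Sequence[str],
    fallbacks: Sequence[str],
) -> Tuple[str, bool]:
    """Return (translation to use, whether a substitution was made).

    When the second value is True, the caller should inform the user that
    the requested translation was replaced.
    """
    if requested in available:
        return requested, False
    # Assumption: "next-closest" is modelled as the first available entry
    # in a caller-supplied preference list.
    for candidate in fallbacks:
        if candidate in available:
            return candidate, True
    raise LookupError("no fallback translation is available")


print(choose_translation("NIVUK", ["KJV", "NIVUK"], ["KJV"]))  # ('NIVUK', False)
print(choose_translation("ESV", ["KJV"], ["NIV", "KJV"]))      # ('KJV', True)
```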

Examples of valid syntax: #

  • JHN.3.16.NIVUK
  • JHN.3.16.KJV
  • JHN.3.16.ESV
  • JHN.3.16.ESVCE
  • JHN.3.16-17.NIVUK

Examples of invalid syntax: #

  • JHN.3.16.NIVUK-17
  • JHN.3.16.NIVUK-17.NIVUK
  • JHN.3.16.NIVUK-JHN.3.17.NIVUK
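Because the translation is always the final period-separated segment, a parser can split it off before parsing the rest of the reference. The Python sketch below shows one way to do this. It assumes that a translation abbreviation contains at least one letter, so that an all-digit final segment is read as a chapter or verse number. This specification does not state that assumption, and the name split_translation is hypothetical.

```python
import re
from typing import Optional, Tuple

TRANSLATION = re.compile(r"[A-Z0-9]+")


def split_translation(reference: str) -> Tuple[str, Optional[str]]:
    """Split a UUSR into (reference without translation, translation or None).

    The returned reference is not validated here; it still has to be parsed
    as a singular or range reference.
    """
    head, period, tail = reference.rpartition(".")
    # Assumption: abbreviations contain at least one letter, so an all-digit
    # final segment is a chapter or verse number rather than a translation.
    if period and head and TRANSLATION.fullmatch(tail) and not tail.isdigit():
        return head, tail
    return reference, None


print(split_translation("JHN.3.16-17.NIVUK"))  # ('JHN.3.16-17', 'NIVUK')
print(split_translation("JHN.3.16"))           # ('JHN.3.16', None)
```

For the invalid examples above, the part returned as the reference (for instance JHN.3.16.NIVUK-17) is not a valid singular or range reference, so a subsequent structural parse would be the step that rejects them.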

Appendices #

Appendix 1: Standard USFM scripture reference #

From: https://ubsicap.github.io/usfm/linking/index.html on ??? date.

When a standard USFM scripture reference is required, you must provide a string of pattern: [A-Z1-4]{3}(-[A-Z1-4]{3})? ?[a-z0-9-:]*

Book names must be one of the standard Book Identifiers
Chapter verse separator is always a colon :
Verse ranges are indicated using a hyphen

Example: MAT 3:1-4

Appendix 2: Standard USX scripture reference #

From: https://ubsicap.github.io/usx/linking.html on ??? date.

When a standard USX scripture reference is required, you must provide a string of pattern: [A-Z1-4]{3} ?[a-z0-9-,:]*

Book names must be one of bookCode
Chapter verse separator is always a colon :
Verse ranges are indicated using a hyphen

Example: MAT 3:1-4
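To illustrate how the USFM and USX example MAT 3:1-4 relates to a UUSR, the following Python sketch converts the simplest reference forms by replacing the space and the colon with periods. It is deliberately restrictive: it accepts only a book, optionally followed by a chapter, a verse, and a single end number, and rejects everything else (including cross-chapter ranges, book ranges and comma lists) rather than guessing. The name usfm_to_uusr is hypothetical.

```python
import re

# Restricted subset of the USFM/USX pattern: BOOK, "BOOK C", "BOOK C:V",
# "BOOK C-C" or "BOOK C:V-V". The book character class is the one quoted
# in the appendices.
SIMPLE_REFERENCE = re.compile(r"[A-Z1-4]{3}(?: [0-9]+(?::[0-9]+)?(?:-[0-9]+)?)?")


def usfm_to_uusr(reference: str) -> str:
    """Convert a simple USFM or USX reference to UUSR syntax."""
    if SIMPLE_REFERENCE.fullmatch(reference) is None:
        raise ValueError(f"unsupported reference form: {reference!r}")
    # Both separators that would need URI encoding become periods.
    return reference.replace(" ", ".").replace(":", ".")


print(usfm_to_uusr("MAT 3:1-4"))  # MAT.3.1-4
```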

Appendix 3: USFM and USX 3-character book identifiers #

From: https://ubsicap.github.io/usfm/identification/books.html on ??? date.

table of book identifiers here