How I made a command-line bible of the German Einheitsübersetzung of 2016

Command-line bibles and how they work

There are several command-line bible projects on GitHub:

https://github.com/layeh/kjv

https://github.com/lukesmithxyz/kjv

https://github.com/lukesmithxyz/vul

https://github.com/lukesmithxyz/grb

https://github.com/AlexBocken/bibel

https://github.com/AlexBocken/allioli

Because the first of them (the layeh repository) has inspired the others, they all work in a similar way.

There are 3 files: a TSV (tab-separated value) file, an AWK script, and a shell script. The TSV file contains the text of the Bible, with 1 verse per line. Every line has 6 tab-separated fields: the full name of the book, the abbreviated name of the book, the number of the book, the chapter number, the verse number, and the verse. The AWK script takes selected passages or search results from the TSV file and presents them. The shell script handles command-line arguments and passes them to the AWK script.
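To make the layout concrete, here is a minimal sketch that builds one line in this format and lets AWK pick out its fields. The verse is taken from the public-domain King James Version; the abbreviation "Gen" and the variable names are chosen for illustration only and need not match any of the projects above.

```shell
#!/bin/sh

# One illustrative line with 6 tab-separated fields: full name,
# abbreviation, book number, chapter number, verse number, and verse.
line=$(printf '%s\t%s\t%s\t%s\t%s\t%s' \
	'Genesis' 'Gen' 1 1 1 \
	'In the beginning God created the heaven and the earth.')

# Split the line on tabs and build a short reference from fields 2, 4,
# and 5. Also count the fields.
reference=$(printf '%s\n' "$line" |
	awk -F '\t' '{ printf("%s %s:%s", $2, $4, $5) }')
field_count=$(printf '%s\n' "$line" | awk -F '\t' '{ print NF }')

printf '%s (%s fields)\n' "$reference" "$field_count"
```

This prints "Gen 1:1 (6 fields)". Because the verse itself contains spaces, the tab has to be the field separator; AWK's default whitespace splitting would break the verse into many fields.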

During installation, the TSV file and the AWK script are combined into a single compressed tar file, which is then appended to the shell script. During use, the shell script decompresses this binary tail to access the AWK and TSV files.
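The following sketch illustrates the idea with throwaway files in a temporary directory. The marker line "#EOF", the file names, and the get_data function are assumptions made for illustration; the projects above may differ in detail.

```shell
#!/bin/sh

# Work in a temporary directory that is removed on exit.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1

# Stand-ins for the real verse data and AWK script.
printf 'Genesis\tGen\t1\t1\t1\tPlaceholder verse.\n' > demo.tsv
printf '{ print $6 }\n' > demo.awk

# "Installation": write the shell part, ending in a marker line, and
# then append the compressed tar archive as a binary tail.
cat > demo <<'END'
#!/bin/sh
# Print one file from the archive that follows the marker line.
get_data() {
	sed '1,/^#EOF$/d' < "$0" | tar -xzO -f - "$1"
}
get_data demo.tsv | awk -F '\t' "$(get_data demo.awk)"
# Stop here so that the shell never tries to run the binary tail.
exit 0
#EOF
END
tar -czf - demo.awk demo.tsv >> demo

# "Use": the script reads the AWK and TSV files from its own tail.
output=$(sh demo)
printf '%s\n' "$output"
```

The sed command deletes everything up to and including the marker line, so only the archive reaches tar. The benefit of this design is that the installed program is a single file.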

Because the mechanism is similar for all projects, the actual verse data in the TSV file is the only significant difference between them.

The Einheitsübersetzung

The Einheitsübersetzung is the approved vernacular translation of the Catholic Bible for the German dioceses. It was first published in 1980, with a revision published in 2016.

Alexander Bocken provides a repository with a TSV file of the 1980 version of the Einheitsübersetzung (linked to above). However, there is no publicly available TSV file of the revised version from 2016. In the following sections, I describe how you can create one.

Legal concerns

To obtain a TSV file with the Einheitsübersetzung, you could buy a paper copy and type it in by hand. This would be perfectly legal (at least under German law) as long as you keep the file to yourself.

However, typing things by hand takes a lot of time. It would be easier to copy the text using your computer's clipboard. In fact, there is a legally available online version of the Einheitsübersetzung:

https://www.bibleserver.com/EU/1.Mose1

This is also the one that the German version of Wikipedia usually links to. You can easily copy the text from there into your computer's clipboard and paste it into a text file. In the following sections, I describe how you can speed this up. Because the text of the Einheitsübersetzung is protected by copyright, you can't share the resulting file, of course, and I won't share mine either.

The question remains whether merely *creating* the file with your computer's clipboard is already illegal. Under German law, making single copies of legally obtained material for private use is legal in principle, and as long as you do it by hand (e.g. by typing it in), it definitely is. But the legal situation might be different if you use a computer to help. Here is why I believe it to be fine, though:

The methods I describe are very easy to carry out. They are essentially Ctrl+C and Ctrl+V. If the maintainers of the online bible wanted to prevent people from doing that, they should have implemented some kind of copy protection (e.g. something as small as a pop-up whenever you hit Ctrl+C would have been sufficient), which in my understanding would render copying illegal (even if it remains technically possible).
I have already bought a paper copy of the Einheitsübersetzung. This is of course not much of a legal argument, but it does appeal to my natural feeling of justice: I already have the text, I just want to access it more easily on my computer.
The resulting file is indistinguishable from one obtained by slowly typing in everything by hand.

In any case, use your own judgment.

I don't have any doubts about the legality of this article, though, since I only describe how you can use your computer according to its intended purpose, namely to facilitate simple tasks.

Capture

In the online bible linked to above, you can navigate from one chapter to the next with the right arrow key in a graphical browser. If you

use dwm,
have configured Super instead of Alt to be your dwm key,
use a keyboard layout that produces "/" with Shift+7, and
have xdotool and GNU nano installed,

you can use the following shell script to copy an entire book of the Bible by simulating key presses:

#!/bin/sh

# Let $1 be the number of chapters to be copied.
[ "$1" ] || exit 1

for i in $(seq "$1")
do
	# Go to tag 2, assuming there is a browser window with the first
	# chapter.
	xdotool key super+2

	# Copy everything.
	xdotool key ctrl+a
	xdotool key ctrl+c

	# Go to tag 3, assuming there is an empty nano editor.
	xdotool key super+3

	# Go to the end of the buffer and insert the chapter.
	xdotool key alt+shift+7
	xdotool key ctrl+shift+v

	# Wait for the pasting to finish.
	sleep 1

	# Go to tag 2 (the browser window).
	xdotool key super+2

	# Request the next chapter and wait for it to load.
	xdotool key Right
	sleep 2
done

Save the resulting text from nano's buffer in a file for the next step. Do this for all 73 books.

Format

Split and number verses

At the time of writing, there are only 2 relevant kinds of lines in the copied data:

Before each chapter, there is a line that consists only of the name of the book, a space, and the chapter number.
All lines containing verses begin with a number (the verse number) followed by a space, and subsequent verses on the same line begin with a space, the verse number, and another space.

All other lines are either empty or they are boilerplate or headings, which begin with characters that are not digits (i.e. letters or other symbols).

You can use this information to split the verses of the copied data into separate lines, number them, and remove all other lines. The following filter (a program that reads from stdin and writes to stdout) does that:

#!/bin/sh

# Let $1 be a regex matching the book's name and $2 be the number of the
# book.
[ "$1" ] || exit 1
[ "$2" ] || exit 1

# Only pass along lines that begin with a number (i.e. verses) or that
# consist only of the book's name (as given by $1), a space, and a number
# (i.e. beginnings of chapters).
grep -e '^[0-9]' -e '^'"$1"' [1-9][0-9]*$' |

# Let every verse have its own line consisting of the verse number, a
# tab, and the verse. A verse begins when at the beginning of a line or
# after a space, a number (the verse number) is followed by a space.
# Note that \n and \t in the replacement are not portable; GNU sed
# supports them.
sed 's/ \([1-9][0-9]*\) /\n\1\t/g; s/^\([1-9][0-9]*\) /\1\t/' |

# Attach chapter numbers (followed by a tab) to the beginning of verse
# lines, and remove the lines beginning chapters. If there are lines
# before the first chapter has begun, pass them along unaltered.
awk '
	BEGIN { chapter = 0 }
	/^'"$1"' [1-9][0-9]*$/ { chapter = $NF; next }
	chapter == 0 { print }
	chapter != 0 { printf("%s\t%s\n", chapter, $0) }' |

# Remove footnotes and attach the book number (as given by $2, followed
# by a tab) to the beginning of a line. Footnotes are square brackets
# containing a number.
sed 's/\[[0-9]*\]//g; s/^/'"$2"'\t/'

The output has 4 fields: the book number, the chapter number, the verse number, and the verse.

Note that only verses remain. Headings are dropped.
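To see what the splitting step does in isolation, the sed command from the filter can be run on a made-up line. The sample text below is a placeholder, not actual Bible text, and like the filter itself the command assumes GNU sed.

```shell
#!/bin/sh

# A made-up input line holding 3 verses.
input='1 First verse. 2 Second verse. 3 Third verse.'

# Same sed command as in the filter: put every verse on its own line
# and separate the verse number from the text by a tab.
output=$(printf '%s\n' "$input" |
	sed 's/ \([1-9][0-9]*\) /\n\1\t/g; s/^\([1-9][0-9]*\) /\1\t/')

printf '%s\n' "$output"
```

The result has 3 lines, each consisting of a verse number, a tab, and the verse. The first substitution handles verses 2 and 3, which follow a space; the second one handles verse 1 at the beginning of the line.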

Check for erroneous splits

If a verse contains a number surrounded by spaces (as you would use it in a sentence), the script above splits it into 2 lines, treating the number as a verse number. Conversely, the script doesn't recognize verse ranges (like "1-3" or "4-5") as verse numbers, because they contain a character that is not a digit. The following filter detects such cases. It checks whether the chapter and verse numbers increase by single steps (accepting ranges that span 2 or 3 verses) and prints all lines where they don't. If there is no output, the input is good. Otherwise, the input file requires manual intervention.

#!/bin/sh

awk -v 'FS=\t' -v 'chapter=0' -v 'verse=0' '
{
	if ($2 != chapter) {
		if ($2 != chapter+1) print
		chapter = $2
		verse = 0
	}

	if      ($3 == verse+1            ) verse += 1
	else if ($3 == verse+1 "-" verse+2) verse += 2
	else if ($3 == verse+1 "-" verse+3) verse += 3
	else print
}'

Note that this script requires its input to have 4 fields, as given by the previous script.
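As a quick sanity check, the same AWK program can be fed a few made-up lines. In this sketch, the function name check_numbering and the sample data are hypothetical; the third line imitates the second half of a verse that was split at a number in the sentence.

```shell
#!/bin/sh

# Hypothetical wrapper around the same AWK program as above.
check_numbering() {
	awk -v 'FS=\t' -v 'chapter=0' -v 'verse=0' '
	{
		if ($2 != chapter) {
			if ($2 != chapter+1) print
			chapter = $2
			verse = 0
		}

		if      ($3 == verse+1            ) verse += 1
		else if ($3 == verse+1 "-" verse+2) verse += 2
		else if ($3 == verse+1 "-" verse+3) verse += 3
		else print
	}'
}

# Made-up data with 4 fields per line: book number, chapter number,
# verse number, and verse.
bad_lines=$(printf '%s\t%s\t%s\t%s\n' \
	1 1 1   'First verse.' \
	1 1 2   'Second verse, after' \
	1 1 40  'days.' \
	1 1 3-4 'Two verses given as a range.' \
	1 1 5   'Fifth verse.' |
	check_numbering)

printf '%s\n' "$bad_lines"
```

Only the line with the verse number 40 is reported. The range "3-4" passes, because it continues the numbering after verse 2.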

Insert book names and abbreviations

The following 2 pages give conventional German names and abbreviations for all 73 books:

https://de.wikipedia.org/wiki/Liste_biblischer_B%C3%BCcher

https://de.wikipedia.org/wiki/Wikipedia:Wie_zitiert_man_Bibelstellen#Abk%C3%BCrzungen_biblischer_B%C3%BCcher

If you have a file with 4 fields (as produced by the formatting script above) for each book, you can insert the first 2 fields like so:

sed -i 's/^/Genesis\tGen\t/' 01-genesis.txt

After doing that for all books, you can concatenate them into a single TSV file.
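Instead of running sed by hand 73 times, both steps can be scripted. The following sketch assumes a hypothetical mapping file books.tsv with 3 tab-separated fields per line (file name, full name, abbreviation), listed in the order the books should appear. It demonstrates the idea on a single made-up book in a temporary directory; the output name combined.tsv is also just an example.

```shell
#!/bin/sh

# Work in a temporary directory that is removed on exit.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1

# Made-up stand-ins: one 4-field book file and the mapping file.
printf '1\t1\t1\tPlaceholder verse.\n' > 01-genesis.txt
printf '01-genesis.txt\tGenesis\tGen\n' > books.tsv

# For each book, prepend the name and the abbreviation to every line.
# The output of the whole loop goes to the combined TSV file.
tab=$(printf '\t')
while IFS="$tab" read -r file name abbr
do
	awk -v "name=$name" -v "abbr=$abbr" \
		'{ printf("%s\t%s\t%s\n", name, abbr, $0) }' "$file"
done < books.tsv > combined.tsv

result=$(cat combined.tsv)
printf '%s\n' "$result"
```

Unlike sed -i, this variant leaves the per-book files untouched, so you can rerun it after correcting a name or an abbreviation in the mapping file.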

Query

The resulting TSV file integrates easily into the projects mentioned above. Just replace their TSV file with yours.

A simpler approach is to define a function like the following in your ~/.bashrc:

# Synopsis: bibel [ BOOK_NAME|BOOK_ABBR|BOOK_NUMBER
#                   [ CHAPTER_NUMBER [ VERSE_NUMBER ] ] ]
bibel() {
	awk -v 'FS=\t' -v "book=$1" -v "chapter=$2" -v "verse=$3" '
		(book == "" || $1 == book || $2 == book || $3 == book) &&
		(chapter == "" || $4 == chapter) &&
		(verse == "" || $5 == verse) {
			if ($5 == 1) printf("\n===== %s %s =====\n\n", $1, $4)
			print $6
		}
	' < ~/einheitsuebersetzung.tsv | fold -s | less -N
}

The paging mechanism is the same as described for the "view" script here:

/software/simple-scripts/view.sh

Ranges cannot be selected, only individual books/chapters/verses. The first verse of every chapter is preceded by a heading.
