Post

LaTeX Reference Checker: Bash Script to Find Unused Labels and Missing Citations

A robust bash script that audits LaTeX documents to find unreferenced figures, tables, equations, and citation issues while ignoring commented code

LaTeX Reference Checker: Bash Script to Find Unused Labels and Missing Citations

Overview

When preparing research manuscripts, it’s common to accumulate unused labels, orphaned citations, and commented-out content. This bash script provides a comprehensive audit of your LaTeX document, identifying:

  • Unreferenced figures, tables, and equations
  • Unused bibliography entries
  • Missing citations (cited but not in .bib file)
  • All while properly ignoring commented lines

Key Feature: The script preserves the sequence in which labels appear in your document, making it easier to locate and manage them.


The Problem

During manuscript development, you might encounter:

IssueExampleImpact
Unreferenced labels\label{fig:old_analysis} never usedClutters document, confuses reviewers
Commented labels counted% \label{tab:removed} still detectedFalse positives in checks
Missing citations\cite{smith2023} but not in .bibCompilation errors
Unused bibliographyReferences added but never citedInflates reference count

Manual checking across 50+ pages with multiple revisions becomes impractical.


The Solution

Features

Comment-aware: Ignores both full-line and inline comments
Sequence-preserving: Shows labels in document order
Comprehensive: Checks figures, tables, equations, and citations
Multiple reference styles: Supports \ref, \autoref, and \eqref
Lightweight: Pure bash, no dependencies
Fast: Processes typical manuscripts in <1 second


Installation and Usage

Step 1: Create the Script

Copy the following code and save it as check_latex_refs.sh:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
#!/bin/bash

if [ $# -eq 0 ]; then
    echo "Usage: $0 <latex_file.tex>"
    exit 1
fi

TEXFILE=$1

# Remove comments in two steps:
# Step 1: Remove lines that start with % (including whitespace before %)
# Step 2: Remove inline comments (everything after % that's not \%)
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')

echo "Checking references in: $TEXFILE"
echo "========================================"

# Tables (preserving order, excluding comments)
echo -e "\n=== TABLES ==="
TAB_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{tab:\K[^}]+')
TAB_LABELS_UNIQ=$(echo "$TAB_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
TAB_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{tab:\K[^}]+' | awk '!seen[$0]++')
TAB_COUNT=$(echo "$TAB_LABELS_UNIQ" | grep -v '^$' | wc -l)
REF_COUNT=$(echo "$TAB_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"
echo "Unreferenced tables:"

while IFS= read -r label; do
    if ! echo "$TAB_REFS" | grep -qx "$label"; then
        echo "  tab:$label"
    fi
done <<< "$TAB_LABELS_UNIQ"

# Figures (preserving order, excluding comments)
echo -e "\n=== FIGURES ==="
FIG_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{fig:\K[^}]+')
FIG_LABELS_UNIQ=$(echo "$FIG_LABELS" | awk '!seen[$0]++')
# Catches both \ref and \autoref
FIG_REFS=$(echo "$TEXCONTENT" | grep -oP '\\(auto)?ref\{fig:\K[^}]+' | awk '!seen[$0]++')
FIG_COUNT=$(echo "$FIG_LABELS_UNIQ" | grep -v '^$' | wc -l)
FIGREF_COUNT=$(echo "$FIG_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $FIG_COUNT"
echo "Total refs: $FIGREF_COUNT"
echo "Unreferenced figures:"

while IFS= read -r label; do
    if ! echo "$FIG_REFS" | grep -qx "$label"; then
        echo "  fig:$label"
    fi
done <<< "$FIG_LABELS_UNIQ"

# Equations (preserving order, excluding comments)
echo -e "\n=== EQUATIONS ==="
EQ_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{eq:\K[^}]+')
EQ_LABELS_UNIQ=$(echo "$EQ_LABELS" | awk '!seen[$0]++')
# Catches \ref, \autoref, and \eqref
EQ_REFS=$(echo "$TEXCONTENT" | grep -oP '\\((auto|eq))?ref\{eq:\K[^}]+' | awk '!seen[$0]++')
EQ_COUNT=$(echo "$EQ_LABELS_UNIQ" | grep -v '^$' | wc -l)
EQREF_COUNT=$(echo "$EQ_REFS" | grep -v '^$' | wc -l)

echo "Total labels: $EQ_COUNT"
echo "Total refs: $EQREF_COUNT"
echo "Unreferenced equations:"

while IFS= read -r label; do
    if ! echo "$EQ_REFS" | grep -qx "$label"; then
        echo "  eq:$label"
    fi
done <<< "$EQ_LABELS_UNIQ"

# Citations (excluding comments)
echo -e "\n=== CITATIONS ==="
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\bibliography\{\K[^}]+' | head -1)

if [ -n "$BIBFILE" ]; then
    [[ "$BIBFILE" != *.bib ]] && BIBFILE="${BIBFILE}.bib"
    
    if [ -f "$BIBFILE" ]; then
        echo "Using bibliography file: $BIBFILE"
        
        BIB_ENTRIES=$(grep -oP '@\w+\{\K[^,]+' "$BIBFILE" | awk '!seen[$0]++')
        CITATIONS=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{\K[^}]+' | tr ',' '\n' | sed 's/^[[:space:]]*//' | awk '!seen[$0]++')
        
        BIB_COUNT=$(echo "$BIB_ENTRIES" | grep -v '^$' | wc -l)
        CITE_TOTAL=$(echo "$TEXCONTENT" | grep -oP '\\cite[tp]?\{[^}]+\}' | wc -l)
        CITE_UNIQ=$(echo "$CITATIONS" | grep -v '^$' | wc -l)
        
        echo "Total bib entries: $BIB_COUNT"
        echo "Total citation commands: $CITE_TOTAL"
        echo "Unique cited keys: $CITE_UNIQ"
        
        echo "Uncited bibliography entries:"
        while IFS= read -r entry; do
            if ! echo "$CITATIONS" | grep -qx "$entry"; then
                echo "  $entry"
            fi
        done <<< "$BIB_ENTRIES"
        
        echo "Missing bibliography entries (cited but not in .bib):"
        while IFS= read -r cite; do
            if ! echo "$BIB_ENTRIES" | grep -qx "$cite"; then
                echo "  $cite"
            fi
        done <<< "$CITATIONS"
    else
        echo "Bibliography file not found: $BIBFILE"
    fi
else
    echo "No bibliography file found in document"
fi

echo -e "\n========================================"
echo "Check complete!"

Step 2: Make it Executable

1
chmod +x check_latex_refs.sh

Step 3: Run the Checker

1
./check_latex_refs.sh manuscript.tex

Example Output

Here’s what the script reports for a typical manuscript:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
Checking references in: paper.tex
========================================

=== TABLES ===
Total labels: 11
Total refs: 8
Unreferenced tables:
  tab:appendix_data
  tab:extra_results
  tab:summary_stats

=== FIGURES ===
Total labels: 6
Total refs: 6
Unreferenced figures:

=== EQUATIONS ===
Total labels: 15
Total refs: 12
Unreferenced equations:
  eq:supplementary
  eq:variance
  eq:alt_form

=== CITATIONS ===
Using bibliography file: references.bib
Total bib entries: 45
Total citation commands: 52
Unique cited keys: 38
Uncited bibliography entries:
  smith2020old
  jones2019unused
  brown2018extra
Missing bibliography entries (cited but not in .bib):
  nguyen2023missing

========================================
Check complete!

Note: The script detects references made with \ref{fig:one}, \autoref{tab:results}, or \eqref{eq:main} - all are counted correctly.


Technical Deep Dive

How It Works

1. Comment Removal

The script uses a two-step approach to handle comments:

1
2
3
# Step 1: Remove lines starting with %
# Step 2: Remove inline comments but preserve \%
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/\([^\\]\)%.*$/\1/')

Important: The order matters! We must remove full-line comments first, then handle inline comments.

Test cases:

LaTeX CodeProcessed As
\label{fig:test}✓ Included
% \label{fig:old}✗ Excluded
Text \ref{fig:a} % comment\ref{fig:a} included
50\% efficiency\% preserved

2. Sequence Preservation

Uses awk '!seen[$0]++' to remove duplicates while maintaining order:

1
2
3
4
5
# Traditional approach (loses order)
sort -u

# Our approach (preserves order)
awk '!seen[$0]++'

Why this matters:

If your document has:

1
2
3
4
5
6
7
8
\section{Methods}        % Line 50
\label{fig:workflow}     % Line 65

\section{Results}        % Line 150
\label{fig:accuracy}     % Line 170

\section{Appendix}       % Line 300
\label{fig:extra}        % Line 310 (unreferenced)

The output shows fig:extra in document order (not alphabetically sorted), making it easier to locate.

3. Pattern Matching

Uses Perl-compatible regex with grep -oP:

1
2
3
4
5
# For tables and figures: catches both \ref and \autoref
grep -oP '\\(auto)?ref\{tab:\K[^}]+'

# For equations: catches \ref, \autoref, and \eqref
grep -oP '\\((auto|eq))?ref\{eq:\K[^}]+'

Breakdown:

  • \\(auto)?ref\{tab: - Match \ref{tab: or \autoref{tab:
  • \\((auto|eq))?ref\{eq: - Match \ref{eq:, \autoref{eq:, or \eqref{eq:
  • \K - Discard everything matched so far
  • [^}]+ - Capture everything until }

Supported reference commands:

  • \ref{...} - Standard LaTeX reference
  • \autoref{...} - Automatic reference from hyperref package
  • \eqref{...} - Equation reference (for equations only)

Examples: Examples:

Labels detected:

  • \label{tab:results} → extracts results
  • \label{fig:analysis_2023} → extracts analysis_2023
  • \label{eq:main_theorem} → extracts main_theorem

References detected:

  • \ref{fig:diagram} → matches diagram
  • \autoref{tab:results} → matches results
  • \eqref{eq:einstein} → matches einstein

Not matched:

  • % \label{tab:old} (filtered by comment removal)
  • \label{sec:intro} (different prefix - use section extension)

Advanced Usage

1. Check Multiple Files

Create check_all.sh:

1
2
3
4
5
6
7
8
9
#!/bin/bash

for file in *.tex; do
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
    echo "File: $file"
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
    ./check_latex_refs.sh "$file"
    echo ""
done

2. Generate Timestamped Reports

1
2
3
4
5
6
7
8
# Create dated log
./check_latex_refs.sh paper.tex > "audit_$(date +%Y%m%d_%H%M%S).log"

# Compare before/after revision
./check_latex_refs.sh paper.tex > before_revision.log
# ... make changes ...
./check_latex_refs.sh paper.tex > after_revision.log
diff before_revision.log after_revision.log

3. Git Pre-commit Hook

Add to .git/hooks/pre-commit:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#!/bin/bash

# Run check
./check_latex_refs.sh main.tex > ref_check.log

# Count unreferenced items
UNREFERENCED=$(grep -c "^  " ref_check.log)

if [ $UNREFERENCED -gt 0 ]; then
    echo "⚠️  Warning: $UNREFERENCED unreferenced items found"
    cat ref_check.log
    
    read -p "Continue commit? (y/n) " -n 1 -r
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi

4. Makefile Integration

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
# Makefile

.PHONY: check clean all

all: manuscript.pdf check

manuscript.pdf: manuscript.tex
	pdflatex manuscript.tex
	bibtex manuscript
	pdflatex manuscript.tex
	pdflatex manuscript.tex

check:
	@./check_latex_refs.sh manuscript.tex

clean:
	rm -f *.aux *.log *.bbl *.blg *.out *.toc

audit:
	@./check_latex_refs.sh manuscript.tex > audit_$(shell date +%Y%m%d).log
	@cat audit_$(shell date +%Y%m%d).log

Usage:

1
2
3
make              # Compile and check
make check        # Run reference check only
make audit        # Generate timestamped audit

5. Pre-submission Checklist Script

Create pre_submit.sh:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#!/bin/bash

echo "📋 Pre-submission Checklist"
echo "═══════════════════════════"

# 1. Reference check
echo "1️⃣  Checking references..."
./check_latex_refs.sh manuscript.tex > ref_audit.log

# 2. Count issues
ISSUES=$(grep -c "^  " ref_audit.log)
if [ $ISSUES -eq 0 ]; then
    echo "   ✓ No unreferenced items"
else
    echo "   ⚠️  $ISSUES unreferenced items found"
    cat ref_audit.log
fi

# 3. Check compilation
echo "2️⃣  Checking compilation..."
pdflatex -interaction=nonstopmode manuscript.tex > /dev/null 2>&1
if [ $? -eq 0 ]; then
    echo "   ✓ Compiles successfully"
else
    echo "   ✗ Compilation errors found"
fi

# 4. Check bibliography
echo "3️⃣  Checking bibliography..."
bibtex manuscript > /dev/null 2>&1
echo "   ✓ Bibliography processed"

echo "═══════════════════════════"
echo "✅ Checklist complete!"

Extensions and Customizations

Add Line Numbers

Modify the unreferenced item display to show line numbers:

1
2
3
4
5
6
while IFS= read -r label; do
    if ! echo "$TAB_REFS" | grep -qx "$label"; then
        LINE=$(grep -n "\\label{tab:$label}" "$TEXFILE" | cut -d: -f1)
        echo "  tab:$label (line $LINE)"
    fi
done <<< "$TAB_LABELS_UNIQ"

Output:

1
2
3
Unreferenced tables:
  tab:appendix_data (line 145)
  tab:extra_results (line 203)

Add Section Labels

Add this after the equations section:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
# Sections (preserving order, excluding comments)
echo -e "\n=== SECTIONS ==="
SEC_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{sec:\K[^}]+')
SEC_LABELS_UNIQ=$(echo "$SEC_LABELS" | awk '!seen[$0]++')
SEC_REFS=$(echo "$TEXCONTENT" | grep -oP '\\ref\{sec:\K[^}]+' | awk '!seen[$0]++')
SEC_COUNT=$(echo "$SEC_LABELS_UNIQ" | grep -v '^$' | wc -l)
SECREF_COUNT=$(echo "$SEC_REFS" | grep -v '^$' | wc -l)

echo "Total section labels: $SEC_COUNT"
echo "Total section refs: $SECREF_COUNT"
echo "Unreferenced sections:"

while IFS= read -r label; do
    if ! echo "$SEC_REFS" | grep -qx "$label"; then
        echo "  sec:$label"
    fi
done <<< "$SEC_LABELS_UNIQ"

Color Output

Add ANSI color codes for better visibility:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# Add at top of script (after shebang)
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Use in output
echo -e "${BLUE}=== TABLES ===${NC}"
echo "Total labels: $TAB_COUNT"
echo "Total refs: $REF_COUNT"

if [ "$TAB_COUNT" -eq "$REF_COUNT" ]; then
    echo -e "${GREEN}✓ All tables referenced!${NC}"
else
    echo -e "${RED}Unreferenced tables:${NC}"
    while IFS= read -r label; do
        if ! echo "$TAB_REFS" | grep -qx "$label"; then
            echo -e "  ${YELLOW}tab:$label${NC}"
        fi
    done <<< "$TAB_LABELS_UNIQ"
fi

Export to CSV

Create a CSV export function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#!/bin/bash
# Export unreferenced items to CSV

TEXFILE=$1
OUTFILE="${TEXFILE%.tex}_unreferenced.csv"

# Run the check and extract unreferenced items
./check_latex_refs.sh "$TEXFILE" > /tmp/ref_check.log

# Create CSV header
echo "Type,Label,Section" > "$OUTFILE"

# Extract and format
grep "^  tab:" /tmp/ref_check.log | sed 's/^  /table,/' >> "$OUTFILE"
grep "^  fig:" /tmp/ref_check.log | sed 's/^  /figure,/' >> "$OUTFILE"
grep "^  eq:" /tmp/ref_check.log | sed 's/^  /equation,/' >> "$OUTFILE"

echo "CSV exported to: $OUTFILE"

JSON Output

Add JSON export capability:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#!/bin/bash
# Add --json flag support

if [ "$2" == "--json" ]; then
    # Extract data
    TAB_LABELS=$(echo "$TEXCONTENT" | grep -oP '\\label\{tab:\K[^}]+' | awk '!seen[$0]++')
    TAB_REFS=$(echo "$TEXCONTENT" | grep -oP '\\ref\{tab:\K[^}]+' | awk '!seen[$0]++')
    
    # Build unreferenced array
    UNREF_TABS=$(comm -23 <(echo "$TAB_LABELS" | sort) <(echo "$TAB_REFS" | sort) | \
        awk '{printf "\"%s\",", $0}' | sed 's/,$//')
    
    # Output JSON
    cat << EOF
{
  "file": "$TEXFILE",
  "timestamp": "$(date -Iseconds)",
  "tables": {
    "total_labels": $(echo "$TAB_LABELS" | grep -v '^$' | wc -l),
    "total_refs": $(echo "$TAB_REFS" | grep -v '^$' | wc -l),
    "unreferenced": [$UNREF_TABS]
  }
}
EOF
fi

Workflow Integration Examples

1. Continuous Integration (CI)

For GitHub Actions (.github/workflows/latex-check.yml):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
name: LaTeX Reference Check

on: [push, pull_request]

jobs:
  check-refs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      
      - name: Run reference checker
        run: |
          chmod +x check_latex_refs.sh
          ./check_latex_refs.sh manuscript.tex
          
      - name: Count issues
        run: |
          ISSUES=$(./check_latex_refs.sh manuscript.tex | grep "^  " | wc -l)
          echo "Found $ISSUES unreferenced items"
          if [ $ISSUES -gt 5 ]; then
            echo "Too many unreferenced items!"
            exit 1
          fi

2. VS Code Task

Add to .vscode/tasks.json:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Check LaTeX References",
      "type": "shell",
      "command": "./check_latex_refs.sh",
      "args": ["${file}"],
      "problemMatcher": [],
      "presentation": {
        "reveal": "always",
        "panel": "new"
      }
    }
  ]
}

3. Overleaf/Git Sync

For projects synced between Overleaf and Git:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#!/bin/bash
# sync_and_check.sh

# Pull from Overleaf
git pull origin master

# Run check
./check_latex_refs.sh main.tex > ref_check.log

# If clean, push changes
if [ $(grep -c "^  " ref_check.log) -eq 0 ]; then
    git add .
    git commit -m "Update: references checked"
    git push origin master
else
    echo "⚠️  Unreferenced items found. Fix before pushing."
    cat ref_check.log
fi

Comparison with Alternatives

ToolProsConsBest For
This scriptFast, customizable, comment-aware, preserves orderSingle file onlyQuick audits, CI/CD
refcheck packageLaTeX-integrated, visual markersMust recompile, clutters PDFDuring writing
chktexComprehensive linting, many checksVerbose, complex outputDeep analysis
VS Code LaTeX WorkshopReal-time, editor-integratedEditor-specific, no batchActive editing
Python scriptsVery flexible, HTML reportsSlower, requires PythonCustom workflows

Real-World Case Study

Initial State

Manuscript for journal submission with 8 months of revisions:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
=== TABLES ===
Total labels: 8
Total refs: 7
Unreferenced tables:
  tab:correlation_matrix

=== FIGURES ===
Total labels: 12
Total refs: 12

=== EQUATIONS ===
Total labels: 6
Total refs: 6

=== CITATIONS ===
Total bib entries: 67
Uncited bibliography entries:
  lecun2015deep
  goodfellow2016deep
  chollet2017deep
  bishop2006pattern
  murphy2012machine
  hastie2009elements
  james2013introduction
  vapnik1995nature
  cover2006elements

Actions Taken

  1. Moved tab:correlation_matrix to supplementary materials
  2. Removed 9 unused deep learning references (leftover from earlier drafts)
  3. Cleaned up bibliography from 67 → 58 entries
  4. Verified all remaining citations were relevant

Final State

1
2
3
4
5
6
7
8
=== TABLES ===
Total labels: 7
Total refs: 7
Unreferenced tables:

=== CITATIONS ===
Total bib entries: 58
Uncited bibliography entries:

Time saved: ~30 minutes of manual checking
Outcome: Clean submission, no reviewer comments about references


Limitations and Workarounds

Current Limitations

  1. Single file only: Doesn’t traverse \input{} or \include{} commands
  2. Standard prefixes: Assumes tab:, fig:, eq:, sec: conventions
  3. Reference commands: Detects \ref, \autoref, and \eqref (for equations)
  4. Citation commands: Only detects \cite, \citep, \citet
  5. Bibliography format: Requires standard \bibliography{} command

Workarounds

For multi-file projects:

1
2
3
4
# Combine files first
cat main.tex chapter*.tex appendix.tex > combined.tex
./check_latex_refs.sh combined.tex
rm combined.tex

For custom prefixes:

1
2
3
# Modify the grep patterns in the script
# Change tab: to tbl:
grep -oP '\\label\{tbl:\K[^}]+'

For additional citation commands:

1
2
# Add to CITATIONS line:
CITATIONS=$(echo "$TEXCONTENT" | grep -oP '\\cite(p|t|author|year|)?\{\K[^}]+' | ...)

For biblatex:

1
2
# Change BIBFILE detection:
BIBFILE=$(echo "$TEXCONTENT" | grep -oP '\\addbibresource\{\K[^}]+' | head -1)

Troubleshooting

Issue: Script shows no output

Solution:

1
2
3
4
5
6
7
8
9
# Check file exists and has content
ls -la manuscript.tex
wc -l manuscript.tex

# Check for labels
grep "\\label{" manuscript.tex | head -5

# Run with debug mode
bash -x check_latex_refs.sh manuscript.tex

Issue: Bibliography not found

Solution:

1
2
3
4
5
6
7
8
# Check bibliography command exists
grep "bibliography" manuscript.tex

# Verify .bib file location
ls -la *.bib

# Check file permissions
ls -la references.bib

Issue: Commented lines still appear

Solution:

1
2
3
4
5
6
7
8
# Check sed version
sed --version

# Try alternative comment removal
TEXCONTENT=$(grep -v '^\s*%' "$TEXFILE" | sed 's/[^\\]%.*$//')

# Or use Perl
TEXCONTENT=$(perl -pe 's/([^\\])%.*$/$1/' "$TEXFILE")

Issue: Special characters in labels

Solution:

1
2
3
4
5
# Escape special characters in grep
grep -oP '\\label\{tab:\K[^}]+' | sed 's/[._-]//g'

# Or use simpler pattern
grep "\\label{tab:" | cut -d'{' -f2 | cut -d'}' -f1

Issue: False positives

Solution:

1
2
3
4
5
# Check if label actually exists in source
grep -n "label{tab:problematic}" manuscript.tex

# Verify it's not in a comment
grep "tab:problematic" manuscript.tex | grep -v "^%"

Performance Optimization

For very large documents (>10,000 lines):

1
2
3
4
5
6
7
8
9
# Use parallel processing for multiple files
find . -name "*.tex" | parallel ./check_latex_refs.sh {}

# Cache intermediate results
TEXCONTENT=$(sed 's/\([^\\]\)%.*$/\1/' "$TEXFILE" | tee /tmp/cleaned.tex)

# Use faster grep alternatives
# Install ripgrep: sudo apt install ripgrep
rg -oP '\\label\{tab:\K[^}]+' "$TEXFILE"

Contributing and Feedback

This tool is open for improvements! If you:

  • Find bugs or edge cases
  • Have feature suggestions
  • Want to add support for other label types
  • Created useful extensions

Please share your feedback or contribute to the project.


Version History

v1.0 (2026-01-02)

  • Initial release
  • Support for tables, figures, equations, citations
  • Comment filtering (inline and full-line)
  • Sequence-preserving output

Planned features:

  • Support for \input{} and \include{}
  • Algorithm and listing labels
  • HTML output format
  • Configurable label prefixes
  • Multi-language support

License

MIT License - Free to use, modify, and distribute.

1
2
3
4
5
6
7
8
9
10
11
12
13
MIT License

Copyright (c) 2026 ScholarNote

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

Citation

If you use this tool in your research workflow:

1
2
3
4
5
6
7
@misc{latex_ref_checker_2026,
  title={LaTeX Reference Checker: Bash Script for Document Auditing},
  author={ScholarNote},
  year={2026},
  url={https://scholarsnote.org/latex-reference-checker-bash/},
  note={Accessed: 2026-01-02}
}

Last updated: January 2, 2026
Tested on: Ubuntu 20.04+, macOS 12+, Git Bash (Windows)
Script version: 1.0


ScholarNote provides free tools and resources for academic researchers. Follow us for updates on new tools and LaTeX productivity tips.

This post is licensed under CC BY 4.0 by the author.