Setting up automated policy compliance checks for university grants

Problem statement

You need a deterministic Python service that evaluates every grant expenditure against the institution’s active policy matrix — federal cost principles, sponsor rules, and EHS restrictions — and resolves each record to one auditable status before the purchase is committed, so that re-running the same batch never produces a different decision or breaks a federal audit.

This task is anchored to its parent cluster, University Policy Mapping Frameworks, and to the broader practice set out in Core Architecture & Policy Mapping for Research Grants. The compliance check is intentionally narrow: it consumes a normalized expenditure record, applies the version-controlled rule set, and emits a deterministic status plus an audit fingerprint. It does not acquire data, post to the ERP, or override a human decision — it screens, fingerprints, and routes anomalies to a compliance officer, consistent with the separation of concerns the parent architecture establishes.

Prerequisites

Before deploying the checker, confirm the following environment and policy configuration:

Python 3.10+ (the code uses modern type hints, enum.Enum, frozen dataclasses, and datetime.timezone.utc).
Libraries: none beyond the standard library — json, hashlib, logging, datetime, re, dataclasses, and enum. Keeping the evaluation core dependency-free is deliberate: it makes the audit logic trivial to vendor, pin, and reproduce. Records are expected to arrive already validated by the How to Map NIH Grant Schemas to Internal Databases mapper, which uses Pydantic and SQLAlchemy upstream.
Environment configuration (never hard-code credentials or matrix paths, per Security Boundary Configuration):
- COMPLIANCE_MATRIX_PATH — filesystem path or object-store URI of the active policy matrix.
- COMPLIANCE_AUDIT_LOG — append-only path for the immutable audit log.
Policy config: a version-controlled policy_matrix JSON document defining unallowable_keywords (e.g. alcohol, entertainment, first-class), ehs_restricted_categories, and the recognized cost-category enumeration. This document is the single source of truth for the rules below and must carry a version hash.
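The configuration above can be sketched as a small loader. This is a minimal illustration, assuming the matrix is a local JSON file; `load_policy_matrix` and `REQUIRED_LIST_KEYS` are hypothetical names, and an object-store URI would need its own client in place of the file read.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict, Optional

# Matrix keys named in the policy config above; the helper itself is illustrative.
REQUIRED_LIST_KEYS = ("unallowable_keywords", "ehs_restricted_categories")


def load_policy_matrix(path: Optional[str] = None) -> Dict[str, Any]:
    """Load the matrix from `path`, or from COMPLIANCE_MATRIX_PATH if omitted."""
    resolved = path or os.environ.get("COMPLIANCE_MATRIX_PATH")
    if not resolved:
        raise RuntimeError("COMPLIANCE_MATRIX_PATH is not set.")
    matrix = json.loads(Path(resolved).read_text(encoding="utf-8"))
    if not isinstance(matrix, dict):
        raise ValueError("Policy matrix must be a JSON object.")
    for key in REQUIRED_LIST_KEYS:
        values = matrix.get(key)
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            raise ValueError(f"Matrix key {key!r} must be a list of strings.")
        # Normalize once so lowercase comparisons in the evaluator line up.
        matrix[key] = [v.strip().lower() for v in values]
    return matrix
```

Failing fast on a malformed matrix keeps a bad rule file from silently approving every record, since the evaluator treats a missing key as an empty list.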

Policy and regulatory alignment

The matrix maps directly to the agency mandates that bound every screened record:

NIH (2 CFR 200): allowable costs, direct vs. indirect allocation, and prohibited items (alcohol, entertainment, first-class travel).
NSF (PAPPG): budget-period alignment, subaward monitoring thresholds, and equipment capitalization rules.
OSHA & EPA: procurement of regulated chemicals, biohazards, or hazardous-waste disposal that requires institutional EHS pre-approval before financial commitment.
Institutional F&A: modified total direct cost (MTDC) calculations aligned with the negotiated indirect cost rate agreement (NICRA).

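Rule updates to the keyword and EHS lists propagate by loading a new matrix version, with no service restart and no code change, which preserves continuous audit readiness. The recognized cost categories are the exception: as written, Gate 3 hard-codes them in a regex, so changing that list still requires a code change.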
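One way to pick up a new matrix version without a restart is to compare a content hash at the start of each batch cycle. The sketch below assumes the version is the SHA-256 of the matrix file bytes; `read_matrix_version` and `reload_if_changed` are hypothetical helpers, not part of the checker.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


def read_matrix_version(path: str) -> Tuple[Dict[str, Any], str]:
    """Return (matrix, version), where version is SHA-256 of the file bytes."""
    raw = Path(path).read_bytes()
    return json.loads(raw), hashlib.sha256(raw).hexdigest()


def reload_if_changed(
    path: str, current_version: Optional[str]
) -> Tuple[Optional[Dict[str, Any]], str]:
    """Return (new_matrix, version) if the file changed, else (None, version)."""
    matrix, version = read_matrix_version(path)
    if version == current_version:
        return None, version  # unchanged: keep using the matrix already loaded
    return matrix, version
```

Hashing the bytes rather than trusting a modification time means a file that was rewritten with identical content does not register as a new rule set.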

Step-by-step implementation

The flow below is enforced by the checker: a normalized record is range-checked, then routed through ordered gates (unallowable-keyword screen, EHS-restriction screen, and cost-category recognition). The first gate that fires sets the status, and a record that clears all three is APPROVED. Identical inputs evaluated against the same matrix version always produce identical outputs and identical hashes, which is what makes a re-run safe.

Figure: each expenditure flows through ordered gates that resolve to one deterministic compliance status.

Step 1 — Define the record contract and status enumeration

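The frozen dataclass is the schema boundary: constructing it from a payload with a missing or unexpected field raises TypeError, so a structurally malformed record never reaches policy evaluation. Dataclasses do not check field types at runtime, so type guarantees still depend on the upstream mapper. The ComplianceStatus enum closes the set of outcomes that evaluate_compliance can return, so no integration can invent an undefined state.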

```python
import json
import hashlib
import logging
import datetime
import os
import re
from dataclasses import dataclass, asdict
from typing import Dict, List, Tuple, Any
from enum import Enum

# Audit-safe logging: append-mode file handler, structured single-line entries.
# The path is read from COMPLIANCE_AUDIT_LOG (see Prerequisites); the fallback
# keeps local runs working when the variable is unset.
AUDIT_LOG_PATH = os.environ.get("COMPLIANCE_AUDIT_LOG", "compliance_audit.log")
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(message)s",
    handlers=[logging.FileHandler(AUDIT_LOG_PATH, mode="a")],
)
logger = logging.getLogger("grant_compliance_engine")


class ComplianceStatus(Enum):
    """Closed set of outcomes returned by evaluate_compliance."""
    APPROVED = "APPROVED"
    FLAGGED = "FLAGGED"
    REJECTED = "REJECTED"
    FALLBACK_ROUTED = "FALLBACK_ROUTED"


class ComplianceError(Exception):
    """Explicit business-rule failure (distinct from a schema failure)."""


@dataclass(frozen=True)
class ExpenditureRecord:
    # All eight fields are required; frozen=True blocks mutation after construction.
    transaction_id: str
    vendor_id: str
    cost_category: str
    allocation_pct: float  # percentage, 0 to 100 inclusive
    funding_source: str
    amount: float          # must be positive
    description: str
    timestamp: str
```
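Because a dataclass does not check field types at runtime, a record built from `{"amount": "12.50"}` would be constructed without complaint. The sketch below shows one way to add an explicit type check at the boundary. `MiniRecord` is a trimmed, hypothetical stand-in for `ExpenditureRecord`, `build_checked` is an illustrative helper, and the check only handles plain `str` and `float` annotations.

```python
from dataclasses import dataclass
from typing import Any, Dict, get_type_hints


@dataclass(frozen=True)
class MiniRecord:
    """Trimmed, hypothetical stand-in for ExpenditureRecord."""
    transaction_id: str
    amount: float


def build_checked(cls, data: Dict[str, Any]):
    """Construct cls(**data), raising TypeError on any field-type mismatch."""
    record = cls(**data)  # missing or unexpected keys raise TypeError here
    for name, expected in get_type_hints(cls).items():
        value = getattr(record, name)
        if expected is float:
            # Accept ints as numeric, but reject bools (bool subclasses int).
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            raise TypeError(
                f"{name} expected {expected.__name__}, got {type(value).__name__}"
            )
    return record
```

Raising TypeError keeps the failure in the same exception family that a missing field produces, so a caller that already quarantines on TypeError needs no new branch.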

Step 2 — Compute a deterministic audit fingerprint

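The fingerprint is computed from the canonicalized record (sort_keys=True) so the same record always hashes to the same digest. This anchors the audit trail: the recorded hash ties each logged decision to the exact record that was screened. Because SHA-256 here is unkeyed, anyone holding the record can recompute it, so non-repudiation also depends on the audit log itself being protected from rewriting.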

```python
def generate_audit_hash(record: ExpenditureRecord) -> str:
    """SHA-256 over the canonical record; guarantees idempotent tracking."""
    payload = json.dumps(asdict(record), sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```
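The canonicalization has two properties worth seeing side by side: key order does not affect the digest, but the numeric type of a value does. The sketch below uses plain dicts and a hypothetical `canonical_digest` helper that mirrors the same `json.dumps` arguments, so it runs without the dataclass.

```python
import hashlib
import json
from typing import Any, Dict


def canonical_digest(fields: Dict[str, Any]) -> str:
    """SHA-256 over sorted-key JSON, mirroring generate_audit_hash."""
    payload = json.dumps(fields, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = {"transaction_id": "TX-1", "amount": 10.0}
b = {"amount": 10.0, "transaction_id": "TX-1"}  # same fields, different key order
c = {"transaction_id": "TX-1", "amount": 10}    # int instead of float

# sort_keys=True makes key order irrelevant.
same_despite_order = canonical_digest(a) == canonical_digest(b)
# json.dumps renders 10.0 as "10.0" and 10 as "10", so the digests differ.
same_despite_type = canonical_digest(a) == canonical_digest(c)
```

The second comparison is why an upstream export that alternates between `10` and `10.0` for the same amount produces two fingerprints for what a reviewer would call one record.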

Step 3 — Evaluate against the policy matrix (stateless and idempotent)

The evaluator applies the gates in policy order. It reads the matrix but mutates no external state, so concurrent retries are safe. Each early return carries the agency rationale that an auditor will read.

```python
def evaluate_compliance(
    record: ExpenditureRecord, policy_matrix: Dict[str, Any]
) -> Tuple[ComplianceStatus, str, str]:
    """Evaluate one expenditure. Returns (status, rationale, audit_hash)."""
    audit_hash = generate_audit_hash(record)
    rationale: List[str] = []

    # Pre-flight range checks: a violation here is a hard business error.
    if record.amount <= 0.0:
        raise ComplianceError("Transaction amount must be positive.")
    if not (0.0 <= record.allocation_pct <= 100.0):
        raise ComplianceError("Allocation percentage must be between 0 and 100.")

    category = record.cost_category.lower()
    desc = record.description.lower()

    # Gate 1: NIH/NSF unallowable-cost screen (2 CFR 200 / NSF PAPPG).
    if any(term in desc for term in policy_matrix.get("unallowable_keywords", [])):
        rationale.append("Contains unallowable cost keywords per 2 CFR 200 / NSF PAPPG.")
        return ComplianceStatus.REJECTED, " | ".join(rationale), audit_hash

    # Gate 2: OSHA/EPA hazardous-procurement screen.
    if category in policy_matrix.get("ehs_restricted_categories", []):
        rationale.append("Requires EHS pre-approval (OSHA/EPA alignment).")
        return ComplianceStatus.FLAGGED, " | ".join(rationale), audit_hash

    # Gate 3: recognized cost category, else route to the manual queue.
    if not re.match(r"^(equipment|supplies|travel|personnel|other)$", category):
        rationale.append("Ambiguous cost category routed to manual compliance queue.")
        return ComplianceStatus.FALLBACK_ROUTED, " | ".join(rationale), audit_hash

    return ComplianceStatus.APPROVED, "Meets all active policy constraints.", audit_hash
```
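Gate 1 uses a plain substring test, which can reject a record whose description merely contains a keyword inside a longer word. The sketch below shows a whole-word alternative. `matched_keywords` is a hypothetical helper, and `wine` is an illustrative keyword that does not appear in the matrix examples above.

```python
import re
from typing import Iterable, List


def matched_keywords(description: str, keywords: Iterable[str]) -> List[str]:
    """Return the keywords that occur in `description` as whole words or phrases."""
    text = description.lower()
    hits: List[str] = []
    for term in keywords:
        # The lookarounds require a non-word character (or a string edge)
        # on both sides of the term; re.escape protects hyphens and dots.
        pattern = r"(?<!\w)" + re.escape(term.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            hits.append(term)
    return hits


# Substring matching fires on "swine"; whole-word matching does not.
substring_hit = "wine" in "swine tissue samples"
whole_word_hits = matched_keywords("Swine tissue samples", ["wine"])
```

Whole-word matching does not settle every case: a description such as "isopropyl alcohol" still matches the keyword `alcohol`, so legitimate lab purchases of that kind still need a human decision or a separate allow-list.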

Step 4 — Wrap evaluation in an idempotent entry point

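The driver parses, evaluates, and logs. Schema failures route to the quarantine queue rather than crashing the batch; business-rule and unexpected failures resolve to explicit, auditable statuses. QUARANTINED and SYSTEM_FALLBACK are driver-level strings, not members of ComplianceStatus, so downstream consumers must handle them separately. Repeated execution over the same payload yields the same status and audit_hash; only evaluated_at differs, and each run appends its own line to the audit log.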

```python
def process_expenditure(raw_payload: str, policy_matrix: Dict[str, Any]) -> Dict[str, Any]:
    """Idempotent entry point: parse, validate, evaluate, log."""
    try:
        data = json.loads(raw_payload)
        record = ExpenditureRecord(**data)
        status, rationale, audit_hash = evaluate_compliance(record, policy_matrix)

        audit_entry = {
            "transaction_id": record.transaction_id,
            "status": status.value,
            "rationale": rationale,
            "audit_hash": audit_hash,
            "evaluated_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        }
        logger.info(json.dumps(audit_entry))
        return audit_entry

    except (json.JSONDecodeError, TypeError, KeyError) as e:
        # Schema boundary breach: quarantine for reconciliation, never drop.
        logger.error(f"Schema validation failed | payload_quarantined | error: {e}")
        return {
            "status": "QUARANTINED",
            "error": str(e),
            "raw_payload_hash": hashlib.sha256(raw_payload.encode()).hexdigest(),
        }
    except ComplianceError as e:
        logger.error(f"Business rule violation | transaction_rejected | error: {e}")
        return {"status": "REJECTED", "error": str(e)}
    except Exception as e:
        logger.critical(f"Unexpected pipeline failure | fallback_triggered | error: {e}")
        return {"status": "SYSTEM_FALLBACK", "error": "Routed to compliance officer queue."}
```

Records that fail the schema boundary or trip the system fallback should be diverted through your Fallback Routing Protocols rather than re-attempted blindly.
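When comparing the results of two runs over the same payload, the run-specific timestamp has to be set aside first. A minimal sketch, assuming `evaluated_at` is the only volatile key in an audit entry; `stable_view` and `same_decision` are hypothetical helpers.

```python
from typing import Any, Dict, Iterable

# Assumption: evaluated_at is the only field expected to change between runs.
VOLATILE_KEYS = ("evaluated_at",)


def stable_view(
    entry: Dict[str, Any], volatile: Iterable[str] = VOLATILE_KEYS
) -> Dict[str, Any]:
    """Copy of an audit entry without its run-specific fields."""
    drop = set(volatile)
    return {k: v for k, v in entry.items() if k not in drop}


def same_decision(first: Dict[str, Any], second: Dict[str, Any]) -> bool:
    """True when two runs agree on everything except the volatile fields."""
    return stable_view(first) == stable_view(second)
```

Passing the two dictionaries returned by consecutive `process_expenditure` calls to `same_decision` gives a one-line idempotency assertion for a test suite.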

Schema and field reference

The checker reads these record fields and matrix keys. Widen the rule set in the version-controlled matrix, not in code.

| Field / key | Type | Constraint | Source rule |
| --- | --- | --- | --- |
| `transaction_id` | string | Non-empty; stable across retries | Ledger idempotency key |
| `cost_category` | string | Matches `^(equipment\|supplies\|travel\|personnel\|other)$` | NSF PAPPG budget categories |
| `allocation_pct` | float | `0 ≤ x ≤ 100` | 2 CFR 200 cost allocation |
| `amount` | float | `> 0` | 2 CFR 200 allowable-cost reporting |
| `funding_source` | string | Maps to an active award/NICRA | Institutional F&A (NICRA) |
| `unallowable_keywords` | list[str] | Lowercased match terms | 2 CFR 200 / NSF PAPPG prohibited items |
| `ehs_restricted_categories` | list[str] | Lowercased category names | OSHA 29 CFR 1910 / EPA RCRA |
| `audit_hash` | string | 64-char SHA-256 hex | Non-repudiation audit trail |
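To keep the recognized cost categories in the matrix rather than in a regex, the membership test can read them from a matrix key and fall back to the current list. This is a sketch only: `recognized_categories` is an assumed key name that the source does not define, and `is_recognized_category` is a hypothetical helper.

```python
from typing import Any, Dict, FrozenSet

# Fallback mirrors the categories listed in the cost_category constraint above.
DEFAULT_CATEGORIES: FrozenSet[str] = frozenset(
    {"equipment", "supplies", "travel", "personnel", "other"}
)


def is_recognized_category(category: str, policy_matrix: Dict[str, Any]) -> bool:
    """Exact, case-insensitive membership test against the matrix enumeration."""
    # "recognized_categories" is an assumed key name, not one the source defines.
    allowed = policy_matrix.get("recognized_categories")
    if allowed:
        allowed_set = frozenset(c.lower() for c in allowed)
    else:
        allowed_set = DEFAULT_CATEGORIES
    return category.lower() in allowed_set
```

With this shape, adding a category becomes a reviewed matrix change with its own version hash instead of a code deployment.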

Verification

Confirm a run behaved correctly before trusting its output:

Evaluate in isolation: call evaluate_compliance(record, matrix) in a REPL with a known record and assert the returned ComplianceStatus matches the gate you expect to fire.
Reproduce the hash: re-run generate_audit_hash on the same record and confirm it equals the audit_hash written to compliance_audit.log. An equal hash proves the logged decision matches the record screened.
Dry-run idempotency: call process_expenditure twice with the identical payload. The two audit entries must carry identical status and audit_hash values — only evaluated_at may differ.
Matrix version pin: confirm the matrix version hash recorded at batch start matches the deployed matrix, so an approval can always be traced to the exact rule set that produced it.
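The hash cross-check above needs the logged entries back in structured form. The sketch below parses lines written with the `%(asctime)s | %(levelname)s | %(message)s` format and keeps only INFO lines, whose message is the JSON audit entry. `iter_audit_entries` and `logged_hashes` are hypothetical helpers.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_audit_entries(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield JSON audit entries from '<asctime> | <level> | <message>' lines."""
    for line in lines:
        # maxsplit=2 keeps any ' | ' inside the message (e.g. a rationale) intact.
        parts = line.rstrip("\n").split(" | ", 2)
        if len(parts) != 3 or parts[1] != "INFO":
            continue  # error and critical lines carry free text, not JSON
        try:
            entry = json.loads(parts[2])
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict):
            yield entry


def logged_hashes(lines: Iterable[str], transaction_id: str) -> List[str]:
    """All audit_hash values logged for one transaction, in log order."""
    return [
        e["audit_hash"]
        for e in iter_audit_entries(lines)
        if e.get("transaction_id") == transaction_id and "audit_hash" in e
    ]
```

Passing an open file handle as `lines` streams the log without loading it whole; every hash returned for a transaction should equal the one recomputed from the record.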

Troubleshooting

Three gotchas specific to this checker:

Audit hash differs on re-run. A volatile field entered the canonical string, or floating-point drift in amount/allocation_pct changed the digest. Normalize money fields to fixed decimal places before constructing the ExpenditureRecord, and confirm json.dumps(..., sort_keys=True) is enforced — the frozen dataclass prevents mutation after construction.
Schema quarantine loops. Repeated QUARANTINED entries for the same payload mean the upstream ERP export does not match the ExpenditureRecord contract. Confirm allocation_pct and amount arrive numeric (not string), and that all eight fields are present: a missing field makes ExpenditureRecord(**data) raise TypeError, which is quarantined rather than silently defaulted.
High FALLBACK_ROUTED volume. Legacy vendor or category codes are not in the recognized enumeration. Map them through a pre-processing translation table before pipeline entry rather than loosening the regex, which would weaken the NSF PAPPG category guarantee. Approved-then-later-flagged records almost always indicate a stale matrix; reload it each batch cycle.
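For the hash-drift case, money fields can be normalized before the record is constructed. A minimal sketch, assuming two decimal places and half-up rounding; both are assumptions to confirm against institutional accounting policy, and `normalize_money` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP


def normalize_money(value, places: int = 2) -> float:
    """Round via Decimal(str(value)) so binary float noise never reaches the hash."""
    quantum = Decimal(10) ** -places  # Decimal("0.01") when places=2
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    # Returning float keeps the value compatible with the float-typed fields.
    return float(rounded)


drifted = 0.1 + 0.2               # 0.30000000000000004 in binary floating point
clean = normalize_money(drifted)  # 0.3
```

The helper always returns a float, so an integer input such as `10` becomes `10.0` and serializes the same way on every run.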

Frequently asked questions

Why screen against the policy matrix in code instead of a database CHECK constraint?

The matrix is a version-controlled JSON artifact, so rule changes are reviewable, diff-able, and roll back cleanly without a schema migration. Evaluation in code also lets each decision carry a human-readable agency rationale that routes straight to a compliance officer, which an opaque constraint violation cannot.

Is it safe to re-run the checker over the same expenditure batch?

Yes. evaluate_compliance is stateless and reads the matrix without mutating it, and generate_audit_hash is deterministic. Re-running the same payload produces an identical status and identical hash, so an accidental double run is harmless.

Should the checker auto-approve or correct a flagged purchase?

No. A FLAGGED or FALLBACK_ROUTED result surfaces the record for human review; it never silently approves or edits it. The pipeline adjudicates structure and policy, then hands ambiguity to a compliance officer.
