This article is a work in progress. If you have any questions, thoughts, or corrections, contact us.

Playbooks

Your loan tape: what lenders actually look at

Originator

Your loan tape: what lenders actually look at

Your loan tape is the foundational document for every quantitative decision a capital provider makes about your deal. It determines advance rates, covenant levels, pricing, and whether the deal happens at all. Most originators submit tapes that are partially usable at best. This topic covers exactly what fields are required, how to identify and fix common data quality problems, and how capital providers use the tape to reach conclusions about your portfolio.


What capital providers actually do with your tape

Understanding the workflow helps you understand what they need and why.

Step 1: Data validation (1-2 days)

  • Check completeness: what percentage of records have data in each field?
  • Check consistency: do the values make sense? (Maturity dates before origination dates, balances below zero, interest rates of 0% or 999%.)
  • Check conformity to the data dictionary: are field values using the expected codes and formats?
  • Identify outliers: loans that fall significantly outside the distribution on any dimension

Step 2: Basic stratification (2-3 days)

  • Distributions across key dimensions: loan size, rate, term, credit score, geography, origination month
  • Delinquency breakdown: current vs. 1-30 vs. 31-60 vs. 61-90 vs. 90+
  • Concentration analysis: any field where a small number of values represent a large percentage of the portfolio

Step 3: Performance analysis (3-5 days)

  • Static pool / vintage analysis: cohort-by-cohort loss and prepayment curves
  • Delinquency roll rates: what percentage of 30 DPD loans roll to 60 DPD? This predicts future losses from the current delinquency stack.
  • Loss severity analysis: for charged-off loans, what percentage of the outstanding balance was recovered?
  • CPR calculation: what percentage of the portfolio is prepaying annually?

Step 4: Stress testing and scenario analysis

  • Base case: CDR/CPR/severity assumptions derived from historical performance
  • Stress case: typically 1.5x-3x base CDR, reduced prepayment, reduced severity
  • Break-even: how bad do losses need to get before the senior investor takes a loss (for term deals) or the borrowing base is breached (for warehouses)?

Every field that’s missing or wrong in your tape either gets excluded from the analysis (reducing pool size) or generates a question that delays the process.


Required field standards by asset class

Universal required fields (all asset classes)

These fields are required for every loan tape regardless of asset class.

FieldDescriptionFormatNotes
Loan IDUnique identifier for each loanString or numericMust be unique; used to join with other data files
Origination DateDate the loan was fundedYYYY-MM-DDCritical for vintage analysis
Original BalancePrincipal balance at originationNumeric, USDNo commas or $ symbols in the cell
Current BalanceOutstanding principal as of tape dateNumeric, USDMust be ≤ Original Balance (except revolving products with capitalized fees)
Gross Interest RateNote rate / couponNumeric, percentUse 6.50, not “6.5% fixed”
Interest Rate TypeFixed or floatingStringValid values: Fixed, Floating, Step-Rate
Index (if floating)SOFR, Prime, etc.StringRequired if Interest Rate Type = Floating
Spread over index (if floating)Margin over the indexNumeric, bps or percentRequired if Interest Rate Type = Floating
Original TermTerm at origination in monthsInteger
Remaining TermRemaining months to scheduled maturityIntegerShould equal months between tape date and scheduled maturity
Scheduled Maturity DateDate of final scheduled paymentYYYY-MM-DD
Loan StatusCurrent delinquency or resolution statusCoded stringSee valid values below
Days Past DueExact days past due as of tape dateInteger (0 for current)Must be consistent with Loan Status
Tape Date / As-Of DateDate as of which all current data is reportedYYYY-MM-DDSingle date for the entire tape
Obligor StateState of obligor’s primary address2-letter abbreviation

Loan Status valid values (minimum standard):

  • Current
  • 1-30 DPD
  • 31-60 DPD
  • 61-90 DPD
  • 90+ DPD
  • Charged Off
  • Paid Off / Prepaid
  • Forbearance / Modification

Asset-class-specific required fields

Consumer unsecured (personal loans, BNPL, income share agreements)

FieldRequired?Notes
FICO Score at OriginationRequiredCurrent FICO optional but valuable
Credit Bureau at OriginationRequiredEquifax / Experian / TransUnion / Vantage
Debt-to-Income Ratio at OriginationRequiredStated or verified; flag which
Income at OriginationRequiredStated or verified; flag which
Income Verification MethodRequiredW2, bank statement, paystub, stated, self-reported
Employment StatusRequiredEmployed FT / PT / Self-employed / Retired / Student
Loan PurposeRequiredDebt consolidation / Home improvement / Medical / Other
Monthly Payment AmountRequired
Number of Prior Payments MadeRecommendedUseful for seasoning analysis
Current FICO (if refreshed)Optional but valuedSignals portfolio drift
NSF / ACH Return CountRecommendedEarly warning indicator of financial stress

Auto loans and leases

FieldRequired?Notes
FICO Score at OriginationRequired
Vehicle MakeRequired
Vehicle ModelRequired
Vehicle YearRequired
Vehicle Mileage at OriginationRequired
New / Used / CPORequired
Vehicle Value (MSRP or NADA) at OriginationRequired
LTV at OriginationRequired
Current LTV (if available)RecommendedRequires current value estimate
Purchase vs. RefinanceRequired
Dealer State (for indirect origination)Recommended

Commercial / SMB loans (SBA, revenue-based, whole business)

FieldRequired?Notes
Business NAICS CodeRequired4 or 6 digit
Business Type (entity structure)RequiredLLC / Corp / Sole Prop / Partnership
Years in BusinessRequired
Personal GuarantorRequiredYes/No; if yes, PG FICO required
PG FICO at OriginationRequired if PG = Yes
Annual RevenueRequiredTTM at origination
Revenue Verification MethodRequiredTax return / bank statement / stated
Collateral TypeRequiredUnsecured / Equipment / RE / Blanket Lien
Collateral Value at OriginationRequired if secured
SBA Guarantee PercentageFor SBA loans75% or 85%
DSCR at OriginationRequired for RE-secured

Mortgage (residential non-QM, non-agency)

FieldRequired?Notes
FICO at OriginationRequired
LTV at OriginationRequired
Combined LTV (CLTV) if subordinate financingRequired if applicable
DSCR (for investor properties)Required for DSCR loans
Property TypeRequiredSFR / Condo / 2-4 Unit / MF / Mixed Use
Property State and CountyRequired
Occupancy TypeRequiredPrimary / Second / Investment
Loan PurposeRequiredPurchase / Rate-Refi / Cash-Out Refi
Documentation TypeRequiredFull doc / Alt doc / Bank statement / DSCR / ITIN
Appraisal Value at OriginationRequired
Appraisal DateRequired
Subject to Forbearance?RequiredFlag all loans with active or recent forbearance

Equipment leases

FieldRequired?Notes
Equipment Type / NAICSRequired
Equipment Year / Make / ModelRequired
Equipment ConditionRequiredNew / Used
Equipment Value at Origination (FMV)Required
Residual Value (for operating leases)RequiredStated residual
Lease TypeRequiredFinance / Operating / TRAC
Lessee Credit Score or Business Credit RatingRequired
Lessee Years in BusinessRequired

Common data quality issues and how to fix them

Missing data

This is the most common problem and the easiest to quantify. Capital providers typically exclude loans from the eligible pool (or apply haircuts to the advance rate) if key fields are missing.

Threshold expectations:

Missing RateAcceptable?
0% missingRequired for: Loan ID, Origination Date, Current Balance, Rate, Loan Status
Under 2% missingExpected for: FICO/credit score, state, term, original balance
Under 5% missingExpected for: income, employment status, verification method, collateral value
Above 5% missing on any required fieldTriggers a data quality conversation

Illustrative pricing. See pricing disclaimer.

How to fix missing data:

  1. Identify root cause: system configuration issue (field existed but wasn’t captured), historical gap (loans originated before you had a field), or genuine borrower data gap
  2. System configuration issues: fix the system and re-export the tape
  3. Historical gaps: if data exists in original application files, it can be manually keyed; budget 15-30 minutes per loan for a remediation project
  4. Genuine data gaps: flag these as missing rather than attempting to estimate or fill with averages

Never fill missing credit scores with median values, fill missing states with a “most common state,” or otherwise impute data without clearly labeling it as estimated. If a capital provider discovers imputed data presented as actual, it destroys credibility for the entire tape.


Logical inconsistencies

These are fields where values are present but make no sense relative to other fields. Run these checks before sending your tape.

CheckLogic ErrorFrequencyFix
Maturity date < origination dateImpossibleUncommon but occursFix at source
Current balance > original balance (for amortizing loans)Impossible without explanationCommon for early-stage systemsAdd a flag for neg-am or fee capitalization if legitimate; fix if data error
Rate = 0% on a performing loanAlmost certainly a data errorCommonCheck if rate is stored as decimal (0.065 vs. 6.5)
Rate > 36% on a consumer loanPossible but requires explanationUncommonVerify; flag for state usury compliance review
Days past due inconsistent with loan status (45 DPD but status = “Current”)InconsistencyCommonReconcile status coding to DPD field
Remaining term > original termImpossibleOccasionalFix at source
Charge-off date before origination dateImpossibleOccasionalFix at source
State code not a valid US state abbreviationData entry errorUncommonFix at source

Python (pandas) or SQL are the most efficient tools for large tapes. Excel works for tapes under roughly 50,000 rows. Build a validation script you can run on every tape export before sending.


Inconsistent formatting

Formatting inconsistencies don’t make the data wrong, but they require cleaning before analysis, which capital providers interpret as lack of operational discipline.

FieldCommon InconsistencyStandard Format
DatesMix of MM/DD/YYYY, YYYY-MM-DD, “Jan 15, 2023”YYYY-MM-DD (ISO 8601)
BalancesSome cells have ”$” and commas, others are plain numbersNumeric, no symbols
Interest ratesSome as decimals (0.065), some as percentages (6.5)Consistent within the tape; state which in data dictionary
StateMix of full state names and abbreviations2-letter abbreviation
Boolean fieldsMix of “Y/N”, “Yes/No”, “1/0”, “True/False”Pick one and use it consistently
Loan statusMix of abbreviated and spelled-out codesConsistent coded values; documented in data dictionary
Null representationMix of blank cells, “N/A”, “NULL”, “None”, “0”Blank for missing numeric; consistent text for missing string

Stale tape dates

Capital providers want the tape to be as current as possible. A tape that’s 3+ months old creates problems: performance data is stale, current balances don’t reflect recent payoffs, and delinquency status may have changed materially.

Standards by use:

  • Warehouse borrowing base calculations: tape should be no more than 30 days old (many facilities require monthly or even bi-weekly updates)
  • Term ABS closing pool: pool cut-off is typically at closing or within 30-60 days of pricing
  • Screening and diligence: a tape up to 90 days old is usually acceptable for initial analysis, with the expectation that a fresh tape will be delivered before commitment

Always label the as-of date in your file name. Capital providers will assume the tape is current. When they discover it isn’t, it creates trust issues.


What drives analytical conclusions from the tape

Understanding what capital providers calculate from your tape helps you anticipate what they’ll ask and how to present your portfolio.

Loss analysis

Cumulative Net Loss Rate (CNL):

  • Calculated as: cumulative charge-offs minus recoveries, divided by original pool balance of the cohort
  • Capital providers build this for each origination cohort and plot it against age-in-months
  • A rising CNL curve that’s flattening suggests losses are seasoning off and the portfolio is stabilizing
  • A CNL curve still rising sharply at 18-24 months suggests losses haven’t peaked yet

What they’re looking for:

  • Absolute level: how does your CNL compare to market benchmarks for your asset class and credit tier?
  • Shape: is the curve consistent across vintages, or are recent vintages performing worse?
  • Slope: is there acceleration in the curve? That’s a red flag.

Gross Default Rate / Constant Default Rate (CDR):

  • CDR is an annualized default rate (how fast the portfolio is defaulting right now)
  • Formula: CDR = 1 - (1 - MDR)^12, where MDR = monthly default rate
  • Capital providers use CDR to extrapolate expected future losses from your current delinquency pipeline

Loss Severity:

  • For charged-off loans: what percentage of the outstanding balance was NOT recovered?
  • Recovery data is required for this calculation. If you don’t track recovery collections separately from new origination payments, fix that.
  • Capital providers typically stress severity 10-25% above your historical average.

Prepayment analysis

Constant Prepayment Rate (CPR):

  • Annualized rate at which the pool is prepaying voluntarily
  • Formula: CPR = 1 - (1 - SMM)^12, where SMM = single monthly mortality (prepayment rate in one month)
  • High CPR shortens WAL and can be positive (lower risk) or negative (you lose spread income on prepaid loans)

What capital providers look for:

  • Is your CPR consistent with your loan product? Is a 36-month loan prepaying as expected, or are people prepaying immediately (suggesting refinancing out)?
  • Does CPR spike in certain market conditions? When rates drop, do your borrowers refinance?
  • For revolving facilities: is the prepayment rate consistent with the replenishment rate you’re representing?

Delinquency roll rates

Roll rates predict future losses from the current delinquency pipeline.

Typical analysis:

  • What percentage of 1-30 DPD loans roll to 31-60 DPD in the next month? (e.g., 25%)
  • What percentage of 31-60 DPD loans roll to 61-90 DPD? (e.g., 40%)
  • What percentage of 61-90 DPD loans roll to 90+ DPD? (e.g., 60%)
  • What percentage of 90+ DPD loans roll to charge-off? (e.g., 80%)

What this tells capital providers: Current $1M of 30 DPD loans implies approximately $1M x 25% x 40% x 60% x 80% = $48K of expected future charge-offs just from that bucket. If your current DQ stack is large, roll rates quantify how much of that will ultimately charge off.

Provide roll rate data if you have it. If you don’t, capital providers will estimate from your monthly tape history if you provide multiple months. The more monthly snapshots you provide, the better the analysis they can do.


Stratification tables

Every capital provider will build these and review them before their investment committee. Build them yourself and include them in your data room; it shows sophistication and pre-empts questions.

Standard strat tables for consumer and auto:

  • FICO score distribution: <600, 600-639, 640-659, 660-679, 680-699, 700-719, 720-749, 750+
  • LTV distribution (secured): <60%, 60-70%, 70-80%, 80-90%, >90%
  • DTI distribution: <30%, 30-40%, 40-45%, 45-50%, >50%
  • Loan size distribution
  • Term distribution
  • Geographic distribution (by state, with comparison to population distribution)
  • Origination vintage distribution (by quarter)
  • Delinquency status distribution

Standard strat tables for commercial/SMB:

  • Business age: <1 year, 1-2 years, 2-5 years, 5-10 years, 10+
  • NAICS code distribution (by industry sector)
  • Revenue size: <$250K, $250K-$500K, $500K-$1M, $1M-$5M, $5M+
  • Loan size distribution
  • Geographic distribution
  • Loan purpose distribution

Each row in the strat table shows a bucket, the count of loans, the balance of loans, and the percentage of total pool balance. Capital providers want both count and balance percentages because large loans can look very different from small loans in the same bucket.


Concentration analysis

Capital providers set concentration limits in transaction documents based on what they see in your pool. Identify your concentrations before entering diligence; you don’t want to discover a 40% single-state concentration during negotiation.

Concentration TypeTypical LimitWhat to Do If You’re Above It
Single state15-25% depending on asset classDiversify over time; expect haircuts on excess
Single employer (salary-deduction loans)10-15%Flag proactively; negotiate a reasonable limit
Single industry/NAICS (commercial)20-30%Explain the concentration; is it your core market?
Single borrower1-5% depending on asset classIneligible for most facilities above that threshold
Channel concentration (single origination channel)40-50% for some structuresDemonstrate channel diversification or explain the relationship
Vintage concentration (>30% originated in last 3 months)Triggers seasoning questionsExplain your origination growth trajectory

Presenting static pool and vintage performance

The static pool analysis is your single most persuasive document for credit-quality conversations.

The required format

Each origination cohort should be a separate row. Cohorts are typically monthly or quarterly. Columns are age-in-months: 1, 3, 6, 9, 12, 18, 24, 30, 36.

Include three tables minimum:

  1. Cumulative Net Loss Rate (CNL) by vintage
  2. 30+ Day Delinquency Rate by vintage (as a percentage of then-outstanding balance)
  3. Cumulative Prepayment Rate (CPR) by vintage

Add a fourth table for deeper analysis: 4. Original pool balance by vintage (shows the analysis is on meaningful pool sizes, not tiny cohorts)


Visual presentation

Capital providers and their IC committees respond much better to charts than tables for performance data.

Essential charts:

  • CNL curves: Each cohort as a line, age-in-months on the x-axis, CNL % on the y-axis. Recent cohorts should track below or on top of older cohorts (recent better than older = improving underwriting or stable environment).
  • DQ curves: Same format as CNL. Delinquency curves peak and then decline in well-performing portfolios; a DQ curve that keeps rising signals future losses.
  • Vintage performance summary: A heat map or table showing CNL at 12 months, 24 months, and 36 months for each vintage, color-coded green/yellow/red vs. expected performance.

The narrative

Don’t just send tables and charts. Include a 1-2 page narrative explaining:

  • What the performance shows at a high level (“Our 2022-2024 vintages are performing within 10-15% of original underwriting expectations”)
  • Any specific cohorts that underperformed and why (“Our Q2 2022 vintage was our first cohort using the direct mail channel, which we discontinued after observing elevated 90+ DPD rates. No subsequent vintages use that channel.”)
  • How you’ve responded to any negative performance signals (“After observing elevated DQ in the 600-640 FICO band, we tightened the minimum FICO score from 600 to 620 in August 2023”)

This narrative does two things: it shows that you monitor your own performance closely, and it pre-empts the questions they were going to ask anyway.


Tape delivery mechanics

File format:

  • CSV is preferred for large tapes (faster to process; compatible with all tools)
  • Excel (.xlsx) is acceptable for tapes under 100,000 rows; use raw data only, no merged cells, no colors, no formulas
  • Pipe-delimited or tab-delimited are acceptable if clearly documented
  • Do not deliver as PDF, as a formatted report, or as a data cube that requires a specific tool to open

File naming convention: [Originator]_LoanTape_YYYYMMDD.csv — for example, AcmeLending_LoanTape_20260401.csv. If delivering multiple files, include an index file listing all files and their contents.

Delivery method:

  • Data room (preferred): upload to the secure folder; do not email sensitive files
  • Email: acceptable only for files with no PII and only aggregate data
  • FTP/SFTP: acceptable but less common now that data rooms are standard

Tape delivery checklist

Run this before delivering any tape to a capital provider.

Completeness

  • All required universal fields present (see universal field table above)
  • Asset-class-specific required fields present
  • Null/missing rate below threshold on all required fields
  • As-of date documented in file name and data dictionary

Consistency

  • Maturity dates are after origination dates for all records
  • Current balance does not exceed original balance (or explained)
  • Interest rates within reasonable range (0.01-36% for consumer; 4-25% for commercial)
  • Days past due consistent with Loan Status field
  • Remaining term consistent with tape date and maturity date

Formatting

  • Dates in YYYY-MM-DD format throughout
  • Numeric fields are numeric (no $, commas, or % symbols in cells)
  • Boolean/flag fields use consistent values (not a mix of Y/N and Yes/No)
  • State field uses 2-letter abbreviations
  • Loan status field uses consistent coded values matching the data dictionary

PII

  • No borrower names in the file
  • No Social Security Numbers
  • No full addresses (state only is fine)
  • No phone numbers, email addresses, or other direct identifiers

Documentation

  • Data dictionary prepared and current (matches all fields in the tape)
  • Any known data quality issues flagged in a cover note
  • Tape reconciles to servicer records (loan count and total balance match)