API

CollegeData.FYI now exposes two public surfaces: simple no-auth JSON endpoints for agents and command-line tools, and the full read-only PostgREST API at https://api.collegedata.fyi. Every page on this site is built from the same endpoints documented below: archived Common Data Set documents, structured CDS fields, source-labeled NCES/IPEDS baseline facts, and curated federal Scorecard context.

Simple endpoints

Start here for MCP tools, CLIs, notebooks, and quick integrations. These endpoints do not require the Supabase anon key and return source labels next to facts so agents can cite values without guessing where they came from.

curl 'https://www.collegedata.fyi/api/schools/search?q=mit'

curl 'https://www.collegedata.fyi/api/schools/mit/facts?categories=admissions,cost,outcomes'

curl 'https://www.collegedata.fyi/api/schools/mit/sources'

curl 'https://www.collegedata.fyi/api/compare?schools=mit,yale,university-of-chicago&categories=admissions,cost,outcomes'

curl 'https://www.collegedata.fyi/api/fields'

curl 'https://www.collegedata.fyi/openapi.json'

The minimal MCP server and CLI live in the repo under packages/mcp-server and packages/cli. Both wrap the simple endpoints rather than reimplementing query logic.
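For callers that prefer code to curl, the simple endpoints above can be addressed with a small URL builder. This is a minimal sketch, not the repo's CLI or MCP implementation: simpleUrl and getJson are hypothetical helper names, and the only things taken from this page are the base URL, the paths, and the comma-separated parameter style.

```javascript
// Base URL of the simple endpoints (taken from the curl examples above).
const SIMPLE_API_BASE = "https://www.collegedata.fyi";

// Hypothetical helper: builds a simple-endpoint URL from a path and a map
// of query parameters. Array values are joined with commas, matching the
// comma-separated lists used by `categories` and `schools`.
function simpleUrl(path, params = {}) {
  const url = new URL(path, SIMPLE_API_BASE);
  for (const [key, value] of Object.entries(params)) {
    const text = Array.isArray(value) ? value.join(",") : String(value);
    url.searchParams.set(key, text);
  }
  return url;
}

// Hypothetical helper: fetches a URL and parses the JSON body,
// failing loudly on any non-2xx response.
async function getJson(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Request failed: ${res.status} ${url}`);
  return res.json();
}

// Same request as the second curl example above.
const factsUrl = simpleUrl("/api/schools/mit/facts", {
  categories: ["admissions", "cost", "outcomes"],
});
```

Calling getJson(factsUrl) inside an async function would issue the request. URLSearchParams percent-encodes the commas, which should be equivalent to the literal commas in the curl form once the server decodes the query string.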

Runbook

1. Smoke-test the API

Start with search, facts, and compare. If these three work, the friendly API surface is healthy enough for most agent and CLI use.

curl 'https://www.collegedata.fyi/api/schools/search?q=mit'
curl 'https://www.collegedata.fyi/api/schools/mit/facts?fields=avg_net_price,graduation_rate_6yr'
curl 'https://www.collegedata.fyi/api/compare?schools=mit,yale,university-of-chicago&fields=acceptance_rate,avg_net_price'

2. Run the CLI

From a checkout of the repo, the CLI can point at production, preview, or localhost with COLLEGEDATA_API_BASE.

COLLEGEDATA_API_BASE=https://www.collegedata.fyi node packages/cli/bin/collegedata.js search mit
COLLEGEDATA_API_BASE=https://www.collegedata.fyi node packages/cli/bin/collegedata.js facts mit --categories admissions,cost
COLLEGEDATA_API_BASE=https://www.collegedata.fyi node packages/cli/bin/collegedata.js compare mit yale university-of-chicago --format csv

3. Connect an MCP client

The MCP server is read-only and uses the same friendly API. Configure a client to run the server command from the repository root.

{
  "mcpServers": {
    "collegedata": {
      "command": "node",
      "args": ["packages/mcp-server/bin/collegedata-mcp.js"],
      "env": {
        "COLLEGEDATA_API_BASE": "https://www.collegedata.fyi"
      }
    }
  }
}

4. Use snapshots for local work

Use pinned snapshot paths for reproducible notebooks and the latest alias for quick experiments.

curl 'https://www.collegedata.fyi/snapshots/latest/manifest.json'
curl 'https://www.collegedata.fyi/snapshots/latest/schools.jsonl'
curl 'https://www.collegedata.fyi/snapshots/latest/school_facts.jsonl'
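Once a snapshot file is downloaded, it can be read one record per line. The sketch below assumes the .jsonl files follow the usual JSON Lines convention (one JSON value per line); parseJsonl is a hypothetical helper and the sample records are invented for illustration, not real snapshot rows.

```javascript
// Parses newline-delimited JSON: one JSON value per non-empty line.
// Trimming each line also tolerates Windows-style \r\n line endings.
function parseJsonl(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Invented sample input; real snapshot records carry their own fields.
const sample = '{"school_id":"mit"}\n{"school_id":"yale"}\n';
const rows = parseJsonl(sample);
```

For reproducible work, the text passed to parseJsonl would come from a pinned snapshot path rather than the latest alias.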

5. Troubleshoot source gaps

A missing value is usually one of three things: the school has no public CDS, the field is not part of the V1 friendly dictionary, or the source row is intentionally withheld because the projected value failed a sanity check. The sources endpoint shows the document and federal release context behind the page.

curl 'https://www.collegedata.fyi/api/schools/mit/sources'
curl 'https://www.collegedata.fyi/api/fields?category=admissions'
curl 'https://www.collegedata.fyi/llms.txt'

Raw PostgREST authentication

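Every request to the raw PostgREST API at https://api.collegedata.fyi requires the Supabase anon key, sent in both an apikey header and an Authorization: Bearer header, as in the curl examples on this page. The simple endpoints above do not need it. The anon key is public and grants read-only access to the published views below.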

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6ImlzZHV3bXlndm1kb3pocHZ6YWl4Iiwicm9sZSI6ImFub24iLCJpYXQiOjE3NzYxMDk3NTksImV4cCI6MjA5MTY4NTc1OX0.fYZOIHyrOWzidgc-CVxWCY5Fe9pQk12-6YjDIS6y9qs
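The curl examples on this page send the key twice, once as apikey and once as a bearer token. A minimal sketch of the same header pair for fetch-based callers follows; anonHeaders is a hypothetical helper name, and the placeholder string stands in for the key printed above.

```javascript
// Builds the two headers used by the curl examples on this page.
// `anonKey` is the public anon key shown above.
function anonHeaders(anonKey) {
  return {
    apikey: anonKey,
    Authorization: `Bearer ${anonKey}`,
  };
}

// Placeholder value; substitute the real anon key.
const headers = anonHeaders("<anon key>");
```

The result can be passed as the headers option of fetch, for example fetch(url, { headers }).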

Resources

GET /rest/v1/cds_manifest

One row per archived CDS document. Joins schools, source URLs, format detection, and extraction status. Carries ipeds_id so federal-data joins are one query away.

Key fields: school_id, school_name, ipeds_id, canonical_year, source_format, source_url, extraction_status, data_quality_flag
Remaining fields (18 total): document_id, sub_institutional, cds_year, detected_year, source_storage_path, participation_status, latest_canonical_artifact_id, discovered_at, last_verified_at, removed_at

GET /rest/v1/cds_artifacts

Raw extraction artifacts keyed by document. Most consumers should prefer cds_fields for field-level queries or the selected-result helper semantics documented below.

Key fields: document_id, kind, producer, notes, created_at
Remaining fields (10 total): id, producer_version, schema_version, storage_path, sha256

GET /rest/v1/cds_fields

242,800 normalized field rows from selected 2024-25+ extraction results. Use this for direct canonical-field queries across schools; derived metrics such as acceptance_rate live in school_browser_rows/browser-search.

Key fields: school_id, school_name, canonical_year, field_id, canonical_metric, value_num, value_text, value_kind, sub_institutional
Remaining fields (21 total): document_id, ipeds_id, year_start, schema_version, value_bool, value_status, source_format, producer, producer_version, data_quality_flag, archive_url, updated_at

GET /rest/v1/school_browser_rows

524 primary 2024-25+ rows across 396 schools, refreshed Jun 4, 2026. This is the curated serving layer for the website browser, CSV exports, and the per-school academic positioning and admission strategy cards.

Key fields: school_id, school_name, canonical_year, applied, admitted, acceptance_rate, yield_rate, ed_offered, ed_applicants, ed_admitted, ea_offered, avg_net_price, sat_composite_p50
Remaining fields (58 total): document_id, sub_institutional, ipeds_id, year_start, schema_version, source_format, producer, producer_version, data_quality_flag, archive_url, enrolled_first_year, undergrad_enrollment_scorecard, scorecard_data_year, retention_rate, pell_rate, sat_submit_rate, act_submit_rate, sat_composite_p25, sat_composite_p75, sat_ebrw_p25, sat_ebrw_p50, sat_ebrw_p75, sat_math_p25, sat_math_p50, sat_math_p75, act_composite_p25, act_composite_p50, act_composite_p75, ed_has_second_deadline, ea_restrictive, wait_list_policy, wait_list_offered, wait_list_accepted, wait_list_admitted, c711_first_gen_factor, c712_legacy_factor, c713_geography_factor, c714_state_residency_factor, c718_demonstrated_interest_factor, app_fee_amount, app_fee_waiver_offered, admission_strategy_card_quality, federal_baseline_available, federal_source_mode, updated_at

GET /rest/v1/school_facts_unified

Source-labeled NCES/IPEDS baseline facts for in-scope institutions. This is the public serving view for the IPEDS coverage layer; raw IPEDS JSON rows are intentionally not exposed.

Key fields: school_id, school_name, ipeds_id, field_key, field_label, display_value, quality_flag, definition_alignment, source_table, source_variable
Remaining fields (29 total): city, state, in_scope, collection_year, data_year, value_numeric, value_text, value_label, unit, cohort, population, source_layer, source_title, release_type, imputation_flag, imputation_label, definition_note, display_group, created_at

GET /rest/v1/ipeds_current_facts

Latest public IPEDS fact per public ipeds_id and field key. Prefers newer data years, then final over provisional over preliminary within the same data year. This view is backed by a materialized serving cache; use school_facts_unified for school-page display because it already joins institution names, slugs, and in-scope filtering.

Key fields: ipeds_id, school_id, field_key, field_label, value_numeric, value_label, release_type, definition_alignment, quality_flag
Remaining fields (27 total): release_id, unitid, collection_year, data_year, value_text, unit, cohort, population, source_table, source_variable, source_title, imputation_flag, imputation_label, definition_note, display_group, public_visible, created_at, rn
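
The preference order described for ipeds_current_facts (newer data year first, then final over provisional over preliminary) can be reproduced client-side when working from ipeds_facts rows. The sketch below is illustrative only: it assumes release_type holds the lowercase strings "final", "provisional", and "preliminary" and that data_year is numeric; pickCurrentFact and the sample rows are hypothetical and do not describe how the view itself is implemented.

```javascript
// Assumed release_type strings; a lower rank is preferred.
const RELEASE_RANK = { final: 0, provisional: 1, preliminary: 2 };

// Returns the preferred row for one (ipeds_id, field_key) group:
// newest data_year first, then best release type within that year.
function pickCurrentFact(rows) {
  const rank = (row) => RELEASE_RANK[row.release_type] ?? 99; // unknown types sort last
  return [...rows].sort((a, b) => {
    if (a.data_year !== b.data_year) return b.data_year - a.data_year;
    return rank(a) - rank(b);
  })[0];
}

// Invented sample rows for a single school and field.
const candidates = [
  { data_year: 2022, release_type: "final", value_numeric: 0.91 },
  { data_year: 2023, release_type: "preliminary", value_numeric: 0.93 },
  { data_year: 2023, release_type: "provisional", value_numeric: 0.92 },
];
const current = pickCurrentFact(candidates);
```

With these sample rows the 2023 provisional row is selected: 2023 beats 2022 on data year, and provisional outranks preliminary within 2023.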

GET /rest/v1/ipeds_facts

Curated long-form NCES/IPEDS facts. This is the source-labeled fact table behind ipeds_current_facts and school_facts_unified; every row keeps release, source variable, imputation, and CDS-definition alignment metadata. For historical reads, filter by ipeds_id, field_key, and data_year rather than raw unitid.

Key fields: ipeds_id, school_id, collection_year, field_key, field_label, value_numeric, value_label, source_table, source_variable, release_type
Remaining fields (26 total): release_id, unitid, data_year, value_text, unit, cohort, population, source_title, imputation_flag, imputation_label, quality_flag, definition_alignment, definition_note, display_group, public_visible, created_at

GET /rest/v1/ipeds_releases

Official NCES/IPEDS release-load provenance. Tracks collection year, data year, release type, release date, source URLs, checksums, and loader notes.

Key fields: collection_year, data_year, release_type, release_date, metadata_url, access_url
Remaining fields (14 total): id, source_page_url, source_page_sha256, metadata_sha256, access_sha256, downloaded_at, notes, created_at

GET /rest/v1/ipeds_tables

IPEDS table metadata from the official Tablesdoc workbook plus loader-observed row counts and source checksums.

Key fields: table_name, survey_component, table_title, data_url, row_count
Remaining fields (15 total): release_id, year_coverage, table_number, description, table_release, table_release_date, dictionary_url, source_sha256, loaded_at, created_at

GET /rest/v1/ipeds_columns

Variable metadata from the official IPEDS Tablesdoc workbook, including long descriptions and imputation-variable pointers.

Key fields: table_name, var_name, var_title, data_type, imputation_var
Remaining fields (22 total): release_id, survey_component, table_number, table_title, var_number, var_order, field_width, format, multi_record, has_rv, file_number, section_number, long_description, var_source, file_title, section_title, created_at

GET /rest/v1/ipeds_value_labels

Categorical code labels from official IPEDS metadata, including reporting-status and imputation-code labels.

Key fields: table_name, var_name, code_value, value_label, frequency
Remaining fields (10 total): release_id, percent, value_order, var_title, created_at

GET /rest/v1/school_merit_profile

Latest primary 2024-25+ CDS Section H merit and need-aid facts per school, joined to selected College Scorecard affordability and outcome fields. Used by the school-page merit profile.

Key fields: school_id, school_name, canonical_year, first_year_ft_students, non_need_aid_recipients_first_year_ft, avg_non_need_grant_first_year_ft, non_need_aid_share_first_year_ft, avg_net_price, graduation_rate_6yr
Remaining fields (56 total): document_id, sub_institutional, ipeds_id, year_start, schema_version, source_format, producer, producer_version, data_quality_flag, archive_url, all_ft_undergrads, need_grants_total, non_need_grants_total, aid_recipients_first_year_ft, aid_recipients_all_ft, avg_aid_package_first_year_ft, avg_aid_package_all_ft, avg_need_grant_first_year_ft, avg_need_grant_all_ft, avg_need_self_help_first_year_ft, avg_need_self_help_all_ft, non_need_aid_recipients_all_ft, avg_non_need_grant_all_ft, non_need_aid_share_all_ft, institutional_need_aid_nonresident, institutional_non_need_aid_nonresident, avg_international_aid, institutional_aid_academics, cds_merit_core_count, cds_merit_field_count, merit_profile_quality, scorecard_data_year, earnings_6yr_median, earnings_8yr_median, earnings_10yr_median, earnings_10yr_p25, earnings_10yr_p75, median_debt_completers, median_debt_monthly_payment, net_price_0_30k, net_price_30k_48k, net_price_48k_75k, net_price_75k_110k, net_price_110k_plus, pell_grant_rate, federal_loan_rate, retention_rate_ft

GET /rest/v1/cds_documents

Raw archive table: one row per (school, sub-institution, year). Most consumers should prefer cds_manifest.

Key fields: id, school_id, ipeds_id, cds_year, detected_year, participation_status, source_sha256
Remaining fields (20 total): school_name, sub_institutional, source_url, source_format, source_page_count, source_provenance, extraction_status, data_quality_flag, discovered_at, last_verified_at, removed_at, created_at, updated_at

GET /rest/v1/cds_scorecard

CDS manifest left-joined with the federal College Scorecard. One row per archived CDS document with post-graduation earnings, debt, net price by income bracket, completion rate, and retention attached. Currently joined to Scorecard 2022-23.

Key fields: school_name, ipeds_id, cds_year, earnings_10yr_median, median_debt_completers, avg_net_price, net_price_0_30k, graduation_rate_6yr, pell_grant_rate
Remaining fields (33 total): document_id, school_id, source_format, source_storage_path, extraction_status, data_quality_flag, latest_canonical_artifact_id, scorecard_data_year, earnings_10yr_p25, earnings_10yr_p75, median_debt_monthly_payment, net_price_30k_48k, net_price_48k_75k, net_price_75k_110k, net_price_110k_plus, grad_rate_pell, repayment_rate_3yr, default_rate_3yr, federal_loan_rate, first_generation_share, median_family_income, retention_rate_ft, endowment_end, instructional_expenditure_fte

GET /rest/v1/scorecard_summary

Curated federal College Scorecard subset, one row per IPEDS UNITID (6,322 institutions, not just CDS-archived ones). Refreshed Apr 21, 2026 after the 2022-23 Scorecard load. For per-program earnings, race-stratified completion, or other fields beyond the curated subset, query Scorecard directly.

Key fields: ipeds_id, school_name, scorecard_data_year, earnings_10yr_median, median_debt_completers, avg_net_price, graduation_rate_6yr, endowment_end
Remaining fields (42 total): refreshed_at, earnings_6yr_median, earnings_8yr_median, earnings_10yr_p25, earnings_10yr_p75, median_debt_noncompleters, median_debt_monthly_payment, cumulative_debt_p90, median_debt_pell, net_price_0_30k, net_price_30k_48k, net_price_48k_75k, net_price_75k_110k, net_price_110k_plus, graduation_rate_4yr, graduation_rate_8yr, grad_rate_pell, transfer_out_rate, repayment_rate_3yr, default_rate_3yr, enrollment, pell_grant_rate, federal_loan_rate, first_generation_share, median_family_income, female_share, retention_rate_ft, carnegie_basic, locale, historically_black, predominantly_black, hispanic_serving, instructional_expenditure_fte, faculty_salary_avg

Examples

Fetch federal baseline facts for a no-CDS school

Federal baseline facts come from school_facts_unified. Keep the release type, source table/variable, quality flag, and definition alignment visible if you reuse these values; they are NCES/IPEDS facts, not school-published CDS fields unless the alignment says so.

curl 'https://api.collegedata.fyi/rest/v1/school_facts_unified?school_id=eq.goshen-college&select=school_id,school_name,field_label,display_value,release_type,collection_year,source_table,source_variable,quality_flag,definition_alignment&order=display_group,field_key' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>'
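
One way to keep provenance attached when reusing a baseline fact is to format the value and its source metadata together. The sketch below is an assumption-laden illustration: describeFact is a hypothetical helper, the output layout is arbitrary, and the sample row uses invented placeholder values rather than real IPEDS table or variable names.

```javascript
// Formats one school_facts_unified row so the value never travels
// without its release type, source table/variable, quality flag,
// and definition alignment.
function describeFact(row) {
  return (
    `${row.field_label}: ${row.display_value} ` +
    `(${row.source_table}.${row.source_variable}, ${row.release_type} release, ` +
    `quality: ${row.quality_flag}, alignment: ${row.definition_alignment})`
  );
}

// Invented placeholder row; real rows come from the query above.
const exampleFact = {
  field_label: "Retention rate",
  display_value: "85%",
  source_table: "EXAMPLE_TABLE",
  source_variable: "EXAMPLE_VAR",
  release_type: "provisional",
  quality_flag: "ok",
  definition_alignment: "aligned",
};
const citation = describeFact(exampleFact);
```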

Fetch a historical IPEDS time series

Historical IPEDS queries are fastest when they use the public ipeds_id key, one or more field_key values, and a bounded data_year range. Avoid filtering raw unitid unless you also know the matching index exists.

curl 'https://api.collegedata.fyi/rest/v1/ipeds_facts?ipeds_id=eq.110635&field_key=in.(retention_rate_full_time,graduation_rate_6yr)&data_year=gte.2019&data_year=lte.2024&select=ipeds_id,data_year,field_key,value_numeric,source_table,source_variable&order=data_year.asc' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>'
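
The same bounded query can be assembled programmatically. This is a minimal sketch using the standard URL and URLSearchParams APIs; ipedsSeriesUrl is a hypothetical helper name, and the filters mirror the curl example above (a shorter select list is used for brevity). Note that data_year appears twice, once per bound, which is why append is used instead of set.

```javascript
const REST_BASE = "https://api.collegedata.fyi/rest/v1";

// Builds an ipeds_facts URL filtered by ipeds_id, a set of field keys,
// and an inclusive data_year range, ordered oldest to newest.
function ipedsSeriesUrl(ipedsId, fieldKeys, fromYear, toYear) {
  const url = new URL(`${REST_BASE}/ipeds_facts`);
  const params = url.searchParams;
  params.append("ipeds_id", `eq.${ipedsId}`);
  params.append("field_key", `in.(${fieldKeys.join(",")})`);
  params.append("data_year", `gte.${fromYear}`); // lower bound
  params.append("data_year", `lte.${toYear}`); // upper bound
  params.append("select", "ipeds_id,data_year,field_key,value_numeric");
  params.append("order", "data_year.asc");
  return url;
}

const seriesUrl = ipedsSeriesUrl(
  110635,
  ["retention_rate_full_time", "graduation_rate_6yr"],
  2019,
  2024
);
```

The resulting URL still needs the apikey and Authorization headers when it is fetched.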

Search the curated school browser

The browser uses an Edge Function so latest-per-school ranking can account for required fields and null answerability. Percent and rate values are stored as fractions from 0 to 1.

curl 'https://api.collegedata.fyi/functions/v1/browser-search' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>' \
  -H 'content-type: application/json' \
  --data '{"mode":"latest_per_school","variant_scope":"primary_only","min_year_start":2024,"filters":[{"field":"acceptance_rate","op":"<=","value":0.1}],"page_size":10}'
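
Because rates are stored as fractions, a filter written in percent terms has to be converted before it is sent, and results converted back for display. The sketch below shows that round trip under the request shape used in the curl example above; pctToFraction, fractionToPct, and acceptanceRateSearchBody are hypothetical helper names.

```javascript
// Converts a percentage such as 10 into the stored fraction 0.1.
function pctToFraction(pct) {
  return pct / 100;
}

// Converts a stored fraction such as 0.25 into a display string.
function fractionToPct(fraction, digits = 1) {
  return `${(fraction * 100).toFixed(digits)}%`;
}

// Builds the same request body as the curl example, with the
// acceptance-rate ceiling supplied in percent.
function acceptanceRateSearchBody(maxAcceptancePct, pageSize = 10) {
  return {
    mode: "latest_per_school",
    variant_scope: "primary_only",
    min_year_start: 2024,
    filters: [
      { field: "acceptance_rate", op: "<=", value: pctToFraction(maxAcceptancePct) },
    ],
    page_size: pageSize,
  };
}

const searchBody = acceptanceRateSearchBody(10);
```

JSON.stringify(searchBody) yields a payload equivalent to the --data argument above.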

Fetch academic positioning data for one school

The academic positioning card reads the already-public school_browser_rows resource. SAT/ACT submit rates are stored as fractions, and the card links to its methodology instead of exposing a separate scoring endpoint.

curl 'https://api.collegedata.fyi/rest/v1/school_browser_rows?school_id=eq.bowdoin&select=school_id,school_name,canonical_year,acceptance_rate,sat_submit_rate,act_submit_rate,sat_composite_p25,sat_composite_p50,sat_composite_p75,act_composite_p25,act_composite_p50,act_composite_p75' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>'

Fetch admission strategy data for one school

Admission strategy fields are also served from school_browser_rows. ED counts are published when the CDS reports them; EA is limited to offered/restrictive flags because CDS C.22 does not include EA applicant or admit counts. The card methodology is documented at /methodology/admission-strategy.

curl 'https://api.collegedata.fyi/rest/v1/school_browser_rows?school_id=eq.bowdoin&select=school_id,school_name,canonical_year,applied,admitted,yield_rate,ed_offered,ed_applicants,ed_admitted,ed_has_second_deadline,ea_offered,ea_restrictive,wait_list_policy,wait_list_offered,wait_list_accepted,wait_list_admitted,c711_first_gen_factor,c712_legacy_factor,c718_demonstrated_interest_factor,app_fee_amount,app_fee_waiver_offered,admission_strategy_card_quality' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>'

Fetch merit-aid context for one school

Merit profile data comes from school_merit_profile, a latest primary CDS Section H view joined to Scorecard affordability and outcome fields. H2A non-need award values are source-reported institutional facts, not personalized price estimates. The card methodology is documented at /methodology/merit-profile.

curl 'https://api.collegedata.fyi/rest/v1/school_merit_profile?school_id=eq.bowdoin&select=school_id,school_name,canonical_year,first_year_ft_students,non_need_aid_recipients_first_year_ft,avg_non_need_grant_first_year_ft,non_need_aid_share_first_year_ft,avg_net_price,graduation_rate_6yr,earnings_10yr_median' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>'

List the 10 most recent manifest rows

curl 'https://api.collegedata.fyi/rest/v1/cds_manifest?select=school_id,school_name,canonical_year&order=canonical_year.desc&limit=10' \
  -H 'apikey: eyJhbGciOiJIUzI1NiIsInR5…' \
  -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5…'

Fetch all archived years for one school

curl 'https://api.collegedata.fyi/rest/v1/cds_manifest?school_id=eq.harvard-university&select=canonical_year,source_format,extraction_status' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>'

Fetch extracted field values for a document

The selected extraction contract chooses the deterministic canonical artifact first. For Tier 4 Docling extracts, the LLM fallback cleaned row can fill gaps, but deterministic values win conflicts. Raw consumers can reproduce that behavior by fetching kind=eq.canonical plus rows with producer=eq.tier4_llm_fallback and overlaying the canonical values on top.

curl 'https://api.collegedata.fyi/rest/v1/cds_artifacts?document_id=eq.<uuid>&kind=eq.canonical&select=notes' \
  -H 'apikey: <anon key>' \
  -H 'Authorization: Bearer <anon key>'
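
The overlay step described above amounts to a merge in which canonical values take precedence and fallback values only fill gaps. The sketch below illustrates that rule on plain objects keyed by field ID. The payload shape is an assumption (this page does not specify how values are laid out inside an artifact), treating null as a gap is also an assumption, and overlayCanonical and the sample field IDs are hypothetical.

```javascript
// Merges two field-value maps: start from the LLM-fallback values,
// then let every non-null canonical value overwrite them.
function overlayCanonical(fallbackValues, canonicalValues) {
  const merged = { ...fallbackValues };
  for (const [fieldId, value] of Object.entries(canonicalValues)) {
    // Assumption: null/undefined in the canonical map means "no value",
    // so the fallback value (if any) is kept.
    if (value !== null && value !== undefined) merged[fieldId] = value;
  }
  return merged;
}

// Invented field IDs and values for illustration.
const fallback = { "X.001": 1000, "X.002": 900 };
const canonical = { "X.001": 1200, "X.003": null };
const selected = overlayCanonical(fallback, canonical);
```

Here X.001 resolves to the canonical 1200, X.002 is filled from the fallback, and X.003 stays absent because neither source has a value for it.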

JavaScript client

The same @supabase/supabase-js client this site uses works against the public API:

import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  "https://api.collegedata.fyi",
  "<anon key>"
);

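// Read-only query: up to 20 manifest rows whose extraction succeeded.
const { data, error } = await supabase
  .from("cds_manifest")
  .select("school_name, canonical_year")
  .eq("extraction_status", "extracted")
  .limit(20);

// supabase-js reports failures through `error` rather than throwing.
if (error) throw error;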

Source documents

Original CDS files are hosted on Supabase Storage. Once you have a manifest row, build the public URL as https://api.collegedata.fyi/storage/v1/object/public/sources/<source_storage_path>. Every file is content-addressed by SHA-256.
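
The URL rule above is a plain prefix plus the manifest row's source_storage_path. A minimal sketch follows; sourceDocumentUrl is a hypothetical helper, the sample path is invented, and stripping a leading slash is a defensive assumption rather than documented behavior.

```javascript
const SOURCES_BASE =
  "https://api.collegedata.fyi/storage/v1/object/public/sources/";

// Builds the public download URL for an archived source document
// from a manifest row's source_storage_path.
function sourceDocumentUrl(sourceStoragePath) {
  // Defensive assumption: tolerate a leading slash on the stored path.
  return SOURCES_BASE + sourceStoragePath.replace(/^\/+/, "");
}

// Invented path for illustration; real paths come from cds_manifest.
const exampleSourceUrl = sourceDocumentUrl("example-school/2024-25/example.pdf");
```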

Schema and licensing

Field IDs follow the canonical 1,105-field schema derived from the CDS Initiative's 2025-26 XLSX template. The full schema is checked into the repo at schemas/. The dataset is MIT-licensed; the underlying CDS documents are owned by their respective institutions and reproduced here under their public-document status.

Found something missing or wrong? Open an issue on GitHub, or browse the school directory.