Data sources
NYC Dept. of Buildings
DOB Complaints Received
dataset: eabe-havv
Every complaint filed with the DOB since 2007, including category, status, inspection dates, and disposition. Primary dataset behind the DOB risk level and neighborhood percentile ranking.
NYC Housing Preservation & Development
HPD Housing Maintenance Violations
dataset: wvxf-dwi5
Formally issued violations for housing maintenance code breaches. Classified by severity: Class C (immediately hazardous), B (hazardous), A (non-hazardous).
NYC Housing Preservation & Development
HPD Housing Maintenance Complaints
dataset: ygpa-z7cr
Complaints filed directly by tenants about housing conditions like heat, hot water, pests, mold, leaks, and more. Classified as Immediate Emergency, Emergency, or Non Emergency.
NYC Dept. of City Planning
NYC Building Footprints
dataset: 5zhs-2jue
Building polygons with roof height and footprint area, used to estimate total building scale for size-normalized comparisons, plus centroids for map placement.
NYC Dept. of City Planning
Neighborhood Tabulation Areas
dataset: NTA2020
2020 NTA polygon boundaries used for point-in-polygon assignment. Each building is placed in exactly one NTA, which defines its peer group for percentile ranking.
Data processing
Download & normalize
Each sync fetches only records filed since the previous run via the Socrata JSON API. Column names are mapped from the raw DOB headers (e.g., Date Entered → date_entered), date columns are parsed, and borough is derived from the first digit of each building's BIN (1 = Manhattan, 2 = Bronx, 3 = Brooklyn, 4 = Queens, 5 = Staten Island).
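As an illustration of the normalization step, here is a minimal sketch of header mapping and borough derivation. The function names `normalize_column` and `borough_from_bin` are hypothetical, and the header rule (lowercase, non-alphanumerics replaced by underscores) is an assumption that happens to reproduce the Date Entered → date_entered example; the actual mapping may be a fixed lookup table.

```python
import re
from typing import Optional

# Borough encoded in the first digit of a BIN, as listed above.
BOROUGH_BY_BIN_DIGIT = {
    "1": "Manhattan",
    "2": "Bronx",
    "3": "Brooklyn",
    "4": "Queens",
    "5": "Staten Island",
}


def normalize_column(raw_header: str) -> str:
    """Turn a raw header such as 'Date Entered' into 'date_entered' (assumed rule)."""
    lowered = raw_header.strip().lower()
    return re.sub(r"[^0-9a-z]+", "_", lowered).strip("_")


def borough_from_bin(bin_value) -> Optional[str]:
    """Return the borough for a BIN, or None when the first digit is not 1-5."""
    text = str(bin_value).strip()
    return BOROUGH_BY_BIN_DIGIT.get(text[:1])


print(normalize_column("Date Entered"))
print(borough_from_bin("3012345"))
```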
Deduplication
Some rows appear more than once in the raw export (e.g., when a complaint status is updated). Duplicates are resolved by keeping the last occurrence of each unique complaint_number, which reflects the most recent known status.
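The keep-last rule can be sketched as follows. This assumes rows are plain dictionaries that arrive in export order, so that a later row for the same complaint_number carries the newer status; `dedupe_keep_last` is a hypothetical helper name.

```python
def dedupe_keep_last(rows, key="complaint_number"):
    """Keep only the last row seen for each key value."""
    latest = {}
    for row in rows:
        # Pop first so the surviving row sits at its last-seen position.
        latest.pop(row[key], None)
        latest[row[key]] = row
    return list(latest.values())


rows = [
    {"complaint_number": "1", "status": "ACTIVE"},
    {"complaint_number": "2", "status": "ACTIVE"},
    {"complaint_number": "1", "status": "CLOSED"},
]
deduped = dedupe_keep_last(rows)
print(deduped)
```

With a dataframe library the same effect is usually a one-line drop-duplicates call with a keep-last option; the dictionary version is shown only to make the rule explicit.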
BIN validation
BINs consisting entirely of zeros (e.g., 0000000) are treated as missing and excluded from scoring. Complaints with no valid BIN cannot be attributed to a specific building.
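A minimal sketch of this check, under the assumption that empty or absent values are also treated as missing. It does not validate length or digit-only content, which the text above does not specify; `is_valid_bin` is a hypothetical name.

```python
def is_valid_bin(bin_value) -> bool:
    """True when a BIN is present and is not made up entirely of zeros."""
    if bin_value is None:
        return False
    text = str(bin_value).strip()
    if not text:
        return False
    # Stripping every '0' leaves an empty string only for all-zero BINs.
    return text.strip("0") != ""


print(is_valid_bin("0000000"), is_valid_bin("1012345"))
```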
Neighborhood assignment
Each building is assigned to an NTA via point-in-polygon using NYC's 2020 NTA boundaries. A spatial index (STRtree) is used for efficient bulk matching. Buildings that fall outside every NTA polygon (typically because their recorded coordinates are missing or slightly off) are excluded from neighborhood comparisons but are still scored.
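The assignment logic can be sketched in plain Python. To stay dependency-free, this sketch swaps the STRtree for a simple bounding-box prefilter and treats each NTA as a single ring without holes, so it illustrates the idea rather than the production approach. The names `assign_nta`, `point_in_ring`, and the sample NTA codes are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def point_in_ring(point: Point, ring: Ring) -> bool:
    """Ray casting: toggle 'inside' at each edge crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bounding_box(ring: Ring) -> Tuple[float, float, float, float]:
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return min(xs), min(ys), max(xs), max(ys)


def assign_nta(point: Point, nta_rings: Dict[str, Ring]) -> Optional[str]:
    """Return the code of the first NTA containing the point, else None."""
    x, y = point
    for code, ring in nta_rings.items():
        min_x, min_y, max_x, max_y = bounding_box(ring)
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # cheap rejection before the exact test
        if point_in_ring(point, ring):
            return code
    return None


nta_rings = {
    "NTA_A": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "NTA_B": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}
print(assign_nta((1.0, 1.0), nta_rings))
print(assign_nta((5.0, 5.0), nta_rings))
```

A building for which `assign_nta` returns None corresponds to the "outside all NTA polygons" case: it keeps its score but has no peer group.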
DOB complaint priority
Each of the 254 DOB complaint category codes is assigned a priority tier based on the DOB's own classification system (rev. 09/21). When a complaint's category is unknown or absent, it defaults to Priority C.
Priority A: Collapse risk, falling debris, blocked egress, gas leaks, elevator accidents
Priority B: Illegal work in progress, no permit, SRO conversion, sprinkler defects
Priority C: Zoning non-compliance, certificate of occupancy issues, failure to maintain
Priority D: Routine inspections, contractor sign absent, inter-agency referrals
HPD violation severity
HPD Housing Maintenance Code violations fall into four severity classes. The weighted violation sum uses the same recency multipliers as the DOB weighted sum (see Recency multiplier under Building size normalization below), so recent serious violations weigh more than old minor ones.
Class C: Lead paint, mold, heat failure, pest infestation, structural hazard
Class B: Broken locks, defective plumbing, missing smoke detectors, damaged floors
Class A: Peeling paint (non-lead), minor repairs, cosmetic defects
Class I: Administrative notices, permit-related items
HPD tenant complaint urgency
HPD tenant complaints are classified by urgency when filed. Because complaints are typically closed once an inspector visits or a violation is issued, raw open counts understate the building's history. The weighted complaint sum captures the full record — with higher weight for urgent and recent complaints.
Immediate Emergency: No heat in winter, gas leak, sewage backup, structural collapse risk
Emergency: Mold, pest infestation, water leak, broken elevator
Non Emergency: Cosmetic damage, minor repairs, general maintenance
Building size normalization
A 200-unit tower will naturally accumulate more complaints than a four-unit brownstone, so raw counts unfairly penalize larger buildings. To make comparisons meaningful, all weighted sums are divided by an estimate of building scale before peer ranking. Because exact unit counts are not available for every building, scale is estimated from each building's footprint area and roof height.
Estimated scale
Footprint area (sq ft) from the building polygon multiplied by estimated floors (roof height ÷ 12 ft per floor). This approximates total floor area without needing unit counts.
Complaint density
Weighted complaint or violation sum divided by estimated scale, scaled to “per 10,000 sq-ft-floors.” A small building and a large building with proportional complaint histories get the same density.
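The two definitions above reduce to a few lines of arithmetic. The sketch below assumes fractional floors are kept rather than rounded, which the text does not specify; the function names and the sample figures are illustrative only.

```python
from typing import Optional

FEET_PER_FLOOR = 12.0    # roof height divided by 12 ft per floor
DENSITY_UNIT = 10_000.0  # report density per 10,000 sq-ft-floors


def estimated_scale(footprint_sq_ft, roof_height_ft) -> Optional[float]:
    """Footprint area times estimated floors; None when either input is unusable."""
    if footprint_sq_ft is None or roof_height_ft is None:
        return None
    if footprint_sq_ft <= 0 or roof_height_ft <= 0:
        return None
    return footprint_sq_ft * (roof_height_ft / FEET_PER_FLOOR)


def complaint_density(weighted_sum: float, scale) -> Optional[float]:
    """Weighted sum per 10,000 sq-ft-floors; None when scale is unavailable."""
    if not scale:
        return None
    return weighted_sum * DENSITY_UNIT / scale


# A small and a large building with proportional histories (made-up numbers).
small = complaint_density(4.0, estimated_scale(2_000.0, 48.0))
large = complaint_density(100.0, estimated_scale(10_000.0, 240.0))
print(small, large)
```

Both buildings come out at the same density, which is the property the normalization is meant to provide. A None result corresponds to the fallback to raw count percentiles.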
Recency multiplier
Applied to all three datasets (DOB, HPD violations, HPD complaints). Complaints with no date recorded contribute nothing to the weighted sum.
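A hedged sketch of how a weighted sum with recency multipliers might be computed. The severity weights and recency bands below are placeholder values invented for illustration, not the values used on this site; only the rule that undated records contribute nothing comes from the text above. Records are assumed to be already classified into tiers.

```python
from datetime import date
from typing import Dict, Iterable, Optional, Sequence, Tuple

# PLACEHOLDER values for illustration only; the real weights are not stated here.
EXAMPLE_SEVERITY_WEIGHTS: Dict[str, float] = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}
# (maximum age in years, multiplier), checked in order.
EXAMPLE_RECENCY_BANDS: Sequence[Tuple[float, float]] = (
    (2.0, 1.0),
    (5.0, 0.5),
    (float("inf"), 0.25),
)


def recency_multiplier(filed: Optional[date], today: date,
                       bands: Sequence[Tuple[float, float]]) -> float:
    """Multiplier for a record's age; undated records contribute nothing."""
    if filed is None:
        return 0.0
    age_years = (today - filed).days / 365.25
    for max_age, multiplier in bands:
        if age_years <= max_age:
            return multiplier
    return 0.0


def weighted_sum(records: Iterable[Tuple[str, Optional[date]]], today: date,
                 severity_weights: Dict[str, float],
                 bands: Sequence[Tuple[float, float]]) -> float:
    """Sum of severity weight times recency multiplier over (tier, date) records."""
    return sum(
        severity_weights[tier] * recency_multiplier(filed, today, bands)
        for tier, filed in records
    )


records = [("A", date(2024, 1, 1)), ("C", date(2015, 1, 1)), ("B", None)]
total = weighted_sum(records, date(2025, 1, 1),
                     EXAMPLE_SEVERITY_WEIGHTS, EXAMPLE_RECENCY_BANDS)
print(total)
```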
Size-normalized percentile
Each building's density is ranked via PERCENT_RANK() within its NTA, separately for HPD violations, HPD complaints, and DOB complaints. Buildings without footprint or height data receive a raw count percentile instead. A density percentile of 20 means the building has fewer weighted complaints per unit of scale than 80% of its residential neighbors.
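For readers unfamiliar with the SQL window function, PERCENT_RANK() is defined as (rank - 1) / (rows in partition - 1), with tied values sharing the lowest rank and a single-row partition yielding 0. The sketch below reproduces that definition per NTA in Python. The input shape and names are hypothetical, and multiplying by 100 to get the displayed percentile is an assumption.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import Dict


def percent_rank_by_nta(density_by_bin: Dict[str, float],
                        nta_by_bin: Dict[str, str]) -> Dict[str, float]:
    """PERCENT_RANK() of each building's density within its own NTA (0.0 to 1.0)."""
    members = defaultdict(list)
    for bin_id, density in density_by_bin.items():
        members[nta_by_bin[bin_id]].append((bin_id, density))

    ranks: Dict[str, float] = {}
    for group in members.values():
        ordered = sorted(density for _, density in group)
        n = len(ordered)
        for bin_id, density in group:
            # Count of strictly smaller values equals SQL rank minus one.
            smaller = bisect_left(ordered, density)
            ranks[bin_id] = 0.0 if n == 1 else smaller / (n - 1)
    return ranks


density_by_bin = {"b1": 1.0, "b2": 2.0, "b3": 2.0, "b4": 4.0, "b5": 10.0, "b6": 7.0}
nta_by_bin = {"b1": "X", "b2": "X", "b3": "X", "b4": "X", "b5": "X", "b6": "Y"}
ranks = percent_rank_by_nta(density_by_bin, nta_by_bin)
print(ranks)
```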
Risk level
A building's neighborhood percentile is mapped to a risk level label shown on building pages and the map. The label reflects how the building compares to residential peers within the same neighborhood — not citywide.
“Insufficient data” and “Not comparable” are handled separately — see below.
Special cases
Insufficient data
Buildings with fewer than 10 total complaints and less than 2 years of complaint history cannot be reliably ranked. They are excluded from percentile comparisons.
Not comparable
Buildings in non-residential NTAs — parks, airports, cemeteries, and similar areas (NTA type ≠ 0) — are excluded from percentile ranking because there are no meaningful residential peers to compare against.
Neighborhood comparisons
Neighborhood percentile
Percentile comparisons are neighborhood-relative, not absolute — a building is compared only to residential peers in its own NTA. Within each NTA, buildings are ranked by weighted complaint density from lowest to highest. A building at the 80th percentile has higher weighted complaint density than 80% of its residential peers — meaning it received relatively more or more serious complaints. Percentiles are computed independently per NTA, so the same density may rank high in one neighborhood and low in another.
Serious complaint rate
Priority A and B complaints per year, averaged over the last 10 years (minimum 1 year). This rate is also percentile-ranked within each NTA and surfaced in the “Severity” insight card on building pages.
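One possible reading of this rate, sketched under stated assumptions: count Priority A and B complaints filed within the last 10 years and divide by the building's years of complaint history, clamped to between 1 and 10. How the averaging period is actually determined is not spelled out above, so treat the clamping rule and the name `serious_complaint_rate` as assumptions.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

DAYS_PER_YEAR = 365.25


def serious_complaint_rate(complaints: Iterable[Tuple[str, Optional[date]]],
                           today: date, window_years: int = 10) -> float:
    """Priority A+B complaints per year over the window (assumed interpretation)."""
    dated = [(p, d) for p, d in complaints if d is not None]
    if not dated:
        return 0.0
    first_filed = min(d for _, d in dated)
    history_years = (today - first_filed).days / DAYS_PER_YEAR
    # Average over at least 1 year and at most the full window.
    years = min(float(window_years), max(1.0, history_years))
    window_days = window_years * DAYS_PER_YEAR
    serious = sum(
        1 for p, d in dated
        if p in ("A", "B") and 0 <= (today - d).days <= window_days
    )
    return serious / years


today = date(2025, 1, 1)
long_history = [
    ("A", date(2024, 6, 1)),
    ("B", date(2023, 3, 1)),
    ("C", date(2022, 1, 1)),
    ("A", date(2010, 1, 1)),  # older than the 10-year window, not counted
]
print(serious_complaint_rate(long_history, today))
```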
Trend
Complaint trend compares the average annual rate of the last 2 years against the 3 years before that. A building is “worsening” if the recent rate exceeds the prior rate by more than 1 complaint per year, and “improving” if it is more than 1 lower. The same algorithm is applied independently to DOB complaints and HPD tenant complaints.
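The trend rule can be written out directly. In this sketch the label "stable" for the in-between case, the 365.25-day year, and the function name are assumptions; the 2-year and 3-year windows and the threshold of 1 complaint per year come from the description above.

```python
from datetime import date
from typing import Iterable

DAYS_PER_YEAR = 365.25


def complaint_trend(filed_dates: Iterable[date], today: date) -> str:
    """Compare the annual rate of the last 2 years with the 3 years before that."""
    recent = 0
    prior = 0
    for filed in filed_dates:
        age_years = (today - filed).days / DAYS_PER_YEAR
        if 0 <= age_years < 2:
            recent += 1
        elif 2 <= age_years < 5:
            prior += 1
    recent_rate = recent / 2.0
    prior_rate = prior / 3.0
    if recent_rate - prior_rate > 1:
        return "worsening"
    if prior_rate - recent_rate > 1:
        return "improving"
    return "stable"  # assumed label for changes within 1 complaint per year


today = date(2025, 1, 1)
recent_heavy = [date(2024, m, 1) for m in range(1, 7)] + [date(2021, 6, 1)] * 3
print(complaint_trend(recent_heavy, today))
```

In the usage example, six complaints in the last 2 years give a recent rate of 3 per year against a prior rate of 1 per year, so the difference exceeds the threshold.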
Leaderboards
The leaderboard pages rank buildings by complaint activity in the last 2 years, not all-time totals, so they reflect current conditions rather than accumulated history. Buildings need at least 10 total complaints to appear.
DOB — Building Safety
Sorted by DOB complaints filed in the last 2 years. Ties broken by serious complaints (Priority A+B) in the same window. Only residential buildings are included (non-residential NTAs excluded).
HPD — Housing Conditions
Sorted by HPD tenant complaints filed in the last 2 years. Ties broken by emergency complaints (Emergency + Immediate Emergency) in the same 2-year window — counting all emergency complaints regardless of whether they are still open.
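Both leaderboards follow the same pattern: filter by a minimum total, sort by recent volume, then break ties by the more serious subset. A minimal sketch, with hypothetical field names (`total_complaints`, `recent_complaints`, `recent_serious`) and made-up sample rows:

```python
from typing import Dict, List

MIN_TOTAL_COMPLAINTS = 10  # threshold stated for leaderboard eligibility


def rank_leaderboard(buildings: List[Dict]) -> List[Dict]:
    """Sort eligible buildings by 2-year complaints, ties broken by serious ones."""
    eligible = [b for b in buildings if b["total_complaints"] >= MIN_TOTAL_COMPLAINTS]
    # Negate both keys so larger counts sort first.
    return sorted(
        eligible,
        key=lambda b: (-b["recent_complaints"], -b["recent_serious"]),
    )


buildings = [
    {"id": "A", "total_complaints": 50, "recent_complaints": 20, "recent_serious": 5},
    {"id": "B", "total_complaints": 40, "recent_complaints": 20, "recent_serious": 9},
    {"id": "C", "total_complaints": 8, "recent_complaints": 8, "recent_serious": 8},
    {"id": "D", "total_complaints": 30, "recent_complaints": 25, "recent_serious": 1},
]
ranked = rank_leaderboard(buildings)
print([b["id"] for b in ranked])
```

For the DOB leaderboard, `recent_serious` would hold Priority A+B counts; for the HPD leaderboard, Emergency plus Immediate Emergency counts. The residential-only filter for the DOB board would be applied before this step.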
Limitations
Complaint ≠ confirmed violation
DOB and HPD complaints are reports filed by the public or other agencies; they are not confirmed findings. HPD violations, by contrast, are formally issued after inspection and carry more weight. Scores reflect the full record of complaints and violations, not only confirmed outcomes.
Records begin in 2007
Electronic DOB complaint records are available from 2007 onwards. HPD violation and complaint records vary in depth. Complaints filed before the digital record period, or those never digitized, are not reflected in scores.
BIN matching
All data is attributed to buildings using the Building Identification Number (BIN). If a complaint or violation was filed with a missing or incorrect BIN, it will not appear on the correct building's page and is excluded from scoring.
Scale estimation
Building scale is estimated from footprint area and roof height. Buildings missing either value cannot be size-normalized and fall back to raw count percentiles within their NTA. Scale is a proxy — it does not account for unit density or occupancy.
Sync frequency
All datasets are refreshed periodically from NYC Open Data. There may be a lag of several days between a complaint being filed and it appearing here.
All data is sourced from NYC Open Data and is in the public domain.