
This case study describes a real engagement. Client identity, proprietary details, and specific metrics are anonymized or approximated under NDA.

HR Tech

Resume Screening and Candidate Ranking System

The Problem

Recruiters were spending 6+ hours daily on resume screening. Each open role drew 300+ applications with no structured ranking: all assessment was manual, resulting in inconsistent criteria application across reviewers and slow time-to-shortlist.

The Solution

NLP pipeline for resume parsing, skill extraction, and structured scoring against job requirements. Produces a ranked candidate list with per-requirement match scores and extracted evidence, allowing recruiters to validate AI assessments rather than perform raw screening.

85%
Screening Time Reduction
4.2x
Recruiter Throughput
12s
Avg Processing Time
Overview

This engagement automated the resume-to-shortlist stage of a high-volume recruitment operation. The system parses resumes in PDF and Word format, extracts structured candidate profiles (experience timeline, skill inventory, education, certifications), scores each candidate against a configurable job requirement specification, and ranks the pool by composite match score. Recruiters interact with the results through a Next.js review interface that shows the ranked list alongside the specific evidence supporting each score — the extracted text that justified the match rating for each requirement. The system processes 300 resumes in approximately 60 minutes of unattended batch processing, compared to the previous 3–4 days of manual screening.


The Challenge

Resume content is highly unstructured and format-variable. The system had to handle single-column and multi-column layouts, tables, skill section formats ranging from bullet lists to prose paragraphs, and date formats with varying levels of precision (month-year, year-only, "current"). Skill extraction was particularly challenging because skill terminology is inconsistent — the same capability might appear as "React.js", "ReactJS", "React (hooks, context, redux)", or simply be implied by project descriptions rather than listed explicitly. Job requirement specifications also varied significantly in quality across the client's different hiring managers, ranging from detailed rubrics to vague one-line descriptions that required normalization before they could be used as scoring targets. Bias mitigation was a stated requirement: the system had to be reviewable and explainable, not a black box score.
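One concrete example of the format variability above is date parsing. The sketch below shows one way to normalize resume dates of varying precision (month-year, year-only, "current") into a comparable form; the function name and precision labels are illustrative, not the production parser:

```python
from datetime import date
import re

def parse_resume_date(text, today=None):
    """Parse a resume date string into a (date, precision) pair.

    Handles "Mar 2021", "2021", and "current"/"present". Illustrative
    sketch only; a production parser covers far more formats.
    """
    today = today or date.today()
    text = text.strip().lower()
    if text in ("current", "present", "now"):
        return today, "current"
    # month-year, e.g. "mar 2021" or "march 2021"
    months = {name: i for i, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"], 1)}
    m = re.match(r"([a-z]{3,9})\s+(\d{4})$", text)
    if m and m.group(1)[:3] in months:
        return date(int(m.group(2)), months[m.group(1)[:3]], 1), "month"
    # year-only, e.g. "2019"
    m = re.match(r"(\d{4})$", text)
    if m:
        return date(int(m.group(1)), 1, 1), "year"
    return None, "unknown"
```

Keeping the precision label alongside the parsed value lets downstream scoring treat a year-only duration more conservatively than a month-precise one.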


How We Built It

01

Document parsing and information extraction (Weeks 1–3): The parsing pipeline handles PDF and Word inputs with a layout detection pass that identifies section boundaries, headers, and content blocks. spaCy handles named entity recognition for organizations, dates, and locations. A secondary extraction pass using OpenAI GPT-4 processes each experience entry to extract role title, company type, duration, and a normalized skill set implied by the role description — going beyond explicitly listed skills to skills implied by context. All extracted data is stored as structured JSON with source span references, enabling the UI to show exactly which text each data point was extracted from.
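The source-span mechanism described above can be sketched as a small record type: every extracted value carries character offsets back into the resume text, so the UI can display the exact evidence. Field names and the sample text are illustrative, not the production schema:

```python
from dataclasses import dataclass

@dataclass
class ExtractedField:
    """One extracted data point with a reference to its source span."""
    field: str   # e.g. "role_title" (illustrative field name)
    value: str   # normalized value
    start: int   # character offset into the resume text
    end: int

    def evidence(self, source_text: str) -> str:
        """Return the exact source text the value was extracted from."""
        return source_text[self.start:self.end]

# Hypothetical example of span-referenced extraction output
resume_text = "Senior Engineer at Acme Corp, Jan 2020 - present"
fields = [
    ExtractedField("role_title", "Senior Engineer", 0, 15),
    ExtractedField("company", "Acme Corp", 19, 28),
]
```

Because the span always points at verbatim source text, a reviewer can confirm or reject an extraction without re-reading the whole resume.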

02

Skill normalization and taxonomy (Weeks 4–5): We built a skill normalization layer that maps free-text skill mentions to a canonical taxonomy using a combination of exact match, alias lookup, and embedding-based similarity for novel terms. The taxonomy was seeded from the client's existing competency framework and extended with technology-specific entries. Skill relationships (e.g., "React" implies "JavaScript") allow the scoring system to make inferences about implied skills rather than requiring explicit mentions. The taxonomy is maintained as a PostgreSQL table that recruiters can update through the admin interface.
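The three-stage lookup described above (exact match, then alias, then embedding similarity) can be sketched as follows. The dictionaries, threshold, and `embed` callable are assumptions for illustration; in production the taxonomy lives in PostgreSQL:

```python
def normalize_skill(mention, canon, aliases, embed=None, threshold=0.8):
    """Map a free-text skill mention to a canonical taxonomy entry.

    Tries exact match, then alias lookup, then (optionally)
    embedding-based similarity for novel terms. Returns None when
    nothing clears the similarity threshold.
    """
    key = mention.strip().lower()
    if key in canon:
        return canon[key]
    if key in aliases:
        return aliases[key]
    if embed is not None:
        # embedding fallback: nearest canonical skill above threshold;
        # `embed(a, b)` is an assumed similarity function in [0, 1]
        best, best_sim = None, threshold
        for name in canon.values():
            sim = embed(key, name.lower())
            if sim > best_sim:
                best, best_sim = name, sim
        return best
    return None

# Tiny illustrative taxonomy slice
canon = {"react": "React", "javascript": "JavaScript"}
aliases = {"react.js": "React", "reactjs": "React"}
```

The alias table absorbs the "React.js" / "ReactJS" variants cheaply, so the embedding fallback only has to handle genuinely novel terms.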

03

Scoring engine and job requirement processing (Weeks 6–7): The scoring engine takes a job requirement specification (structured as a set of required and preferred criteria with configurable weights) and produces per-requirement match scores for each candidate. Required criteria are scored on a 0–3 scale (no match, partial match, strong match, exceeds) with the supporting evidence extracted from the resume. Composite scores weight required criteria 3x over preferred criteria by default, with weight adjustment available per requirement. Job specifications with vague descriptions are flagged for refinement during the upload flow, with specific prompts to help hiring managers provide scoreable criteria.
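The composite weighting described above can be sketched as a weighted average over the 0–3 per-requirement scores, with required criteria multiplied 3x by default. The exact production formula may differ; the input shape here is an assumption for illustration:

```python
def composite_score(scores, required_multiplier=3.0):
    """Weighted composite of per-requirement match scores.

    `scores` maps requirement id -> (score, weight, is_required),
    where score is on the 0-3 scale (no match .. exceeds). Required
    criteria are weighted 3x over preferred by default.
    """
    total = weight_sum = 0.0
    for score, weight, required in scores.values():
        w = weight * (required_multiplier if required else 1.0)
        total += score * w
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

# Hypothetical candidate: strong on a required skill,
# partial on a preferred one
scores = {
    "python": (3, 1.0, True),       # required, strong match
    "kubernetes": (1, 1.0, False),  # preferred, partial match
}
```

With the defaults above, this candidate scores (3·3 + 1·1) / (3 + 1) = 2.5, so a strong required match dominates a weak preferred one, as intended.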

04

Review interface and deployment (Week 8): The Next.js review interface presents the ranked candidate pool with per-requirement score breakdowns and the extracted resume text that supported each score. Recruiters can override scores, add manual notes, and advance candidates to the next stage from the interface. All recruiter actions are logged for audit purposes and to measure score override rates — a high override rate on a specific requirement is a signal that the scoring logic for that criterion needs refinement. Processing time averages 12 seconds per resume at production batch sizes.
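The override-rate signal described above reduces to a simple aggregation over the audit log. The log-entry shape here is an assumption for illustration, not the production schema:

```python
from collections import defaultdict

def override_rates(audit_log):
    """Per-requirement score override rate from the recruiter audit log.

    `audit_log` is assumed to be a list of dicts like
    {"requirement": "python", "overridden": True}. A persistently
    high rate for one requirement flags its scoring logic for review.
    """
    seen = defaultdict(int)
    overridden = defaultdict(int)
    for entry in audit_log:
        seen[entry["requirement"]] += 1
        if entry["overridden"]:
            overridden[entry["requirement"]] += 1
    return {req: overridden[req] / seen[req] for req in seen}
```

Running this weekly per requirement, rather than pooled across a whole role, is what surfaces the specific criteria whose scoring logic needs refinement.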


What We Delivered

Screening time per open role dropped from an average of 3–4 days of recruiter effort to approximately 4 hours for a 300-application pool — an 85% reduction. Recruiter time is now spent on reviewing and validating AI assessments rather than performing raw screening, which has improved shortlist quality: hiring managers reported higher average interview-to-offer ratios in the 90 days post-deployment.

Recruiter throughput increased 4.2x, measured as the number of open roles actively screened per recruiter per week. Average processing time per resume is 12 seconds in batch mode. The batch processing infrastructure handles the peak intake load of 800+ resumes on Monday mornings (when most applications arrive following weekend job posts) within approximately 2.5 hours of batch job start.

Score override rates stabilized at approximately 9% of candidates per role after the first 4 weeks of operation, which the client's talent acquisition team considered acceptable for an AI-assisted system. The evidence display in the review interface has been specifically cited as the reason override rates are low: recruiters can see exactly why a candidate received a given score, and when the evidence is correct they consistently agree with the AI assessment.

Tech Stack
Python, spaCy, OpenAI, PostgreSQL, Next.js, Docker
Timeline
8 weeks
Team Size
2 engineers

Ready to build something like this?

Tell us what you are building. We will scope it, price it honestly, and give you a clear plan.

Start a Conversation

Free 30-minute scoping call. No obligation.