We conduct AI technical due diligence in four layers: capability assessment (does the system do what it claims on your inputs?), code and infrastructure quality (is it maintainable and scalable?), AI-specific technical debt (MLOps maturity, data lineage, evaluation quality, data rights), and risk assessment (vendor lock-in, integration risk, operational risk at scale).
We conduct capability assessment using your specific test cases, not vendor-provided benchmarks. We design a test dataset representative of your intended use, run the system against it, and measure the metrics that matter for your use case. This is the only reliable basis for an acquisition decision: vendor benchmarks are systematically optimistic.
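A minimal sketch of this kind of independent capability test, assuming a hypothetical `call_vendor_api` function standing in for whatever inference endpoint the vendor actually exposes (the toy rule inside it is illustrative only):

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    input_text: str
    expected_label: str

def call_vendor_api(input_text: str) -> str:
    # Placeholder for the vendor's real inference endpoint.
    # Toy rule used here purely so the sketch runs end to end.
    return "positive" if "good" in input_text else "negative"

def evaluate(cases: list[TestCase]) -> float:
    """Accuracy on your representative cases, not the vendor's benchmark."""
    correct = sum(call_vendor_api(c.input_text) == c.expected_label for c in cases)
    return correct / len(cases)

# A representative test set would be built from your own inputs and labels.
cases = [
    TestCase("good product, works as advertised", "positive"),
    TestCase("fails on our edge case", "negative"),
]
print(f"accuracy on our test set: {evaluate(cases):.2f}")
```

The point of the sketch is the separation of concerns: your labeled cases and your metric live outside the vendor's system, so the same harness can be rerun against a claimed benchmark to quantify the gap.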
Due diligence engagement structure
01. Scope definition and access requirements: Define the system components in scope, the acquisition or investment thesis, and the specific capability claims to test. Establish access requirements: API access, code repository, infrastructure documentation, data documentation, and interview time with technical leads.
02. Independent capability testing: Design and execute tests using inputs representative of your use case. Document performance against your test cases and compare it to claimed benchmarks. Map the demo-versus-production gap explicitly.
03. Infrastructure and code audit: Review system architecture, code quality, test coverage, deployment processes, and operational procedures. Assess scalability and identify infrastructure risks at the target scale.
04. AI-specific debt and data rights assessment: Audit MLOps maturity across experiment tracking, model registry, retraining pipeline, and monitoring. Audit training data provenance, annotation quality, and licensing. Identify data rights risks.
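One way to make such an audit repeatable is a simple scoring rubric. The sketch below is illustrative only: the dimension names mirror the audit items above, and the 0-3 scale and the threshold of 2 are our own hypothetical choices, not an industry standard.

```python
# Dimensions mirror the MLOps and data-rights audit items.
MATURITY_DIMENSIONS = [
    "experiment_tracking",
    "model_registry",
    "retraining_pipeline",
    "monitoring",
    "data_provenance",
    "annotation_quality",
    "data_licensing",
]

def maturity_findings(scores: dict[str, int]) -> list[str]:
    """Flag every dimension scoring below 2 on a 0-3 scale as a finding."""
    findings = []
    for dim in MATURITY_DIMENSIONS:
        score = scores.get(dim, 0)  # no evidence provided scores zero
        if score < 2:
            findings.append(f"{dim}: score {score}, below threshold")
    return findings

# Example: strong experiment tracking, weak registry, nothing else evidenced.
example_scores = {"experiment_tracking": 3, "model_registry": 1, "monitoring": 2}
for finding in maturity_findings(example_scores):
    print(finding)
```

Scoring missing evidence as zero is deliberate: in a due diligence setting, an undocumented capability should be treated as absent until the target demonstrates it.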
05. Risk register and findings report: A prioritized findings report that separates deal-breaker issues from negotiation-relevant items, covering vendor dependency risk, integration risks, and the operational cost model at target scale.
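The deal-breaker versus negotiation-item split can be captured in a small data structure. This is a hypothetical sketch of one possible register format; the severity labels and example findings are ours, not a fixed taxonomy.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    title: str
    severity: str  # "deal_breaker" or "negotiation"
    category: str  # e.g. "data_rights", "integration", "operational_cost"

def partition(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split the register into deal-breakers and negotiation-relevant items."""
    breakers = [f for f in findings if f.severity == "deal_breaker"]
    negotiable = [f for f in findings if f.severity != "deal_breaker"]
    return breakers, negotiable

# Illustrative entries only.
register = [
    Finding("Training data license prohibits commercial use", "deal_breaker", "data_rights"),
    Finding("Inference cost doubles at target scale", "negotiation", "operational_cost"),
]
breakers, negotiable = partition(register)
print(f"{len(breakers)} deal-breaker(s), {len(negotiable)} negotiation item(s)")
```

Keeping severity as explicit data, rather than prose emphasis in a report, lets the same register drive both the executive summary and the negotiation checklist.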