---
title: "Benchmarks"
canonical_url: "https://www.sorena.io/solutions/benchmarks"
source_url: "https://www.sorena.io/solutions/benchmarks"
author: "Sorena AI"
---

# Benchmarks

*Benchmark Report*

January 2026 | Two Auditors | Source Citations | Two Passes

Two auditors scored both Sorena Research Copilot and ChatGPT (baseline) against the same requirements across 43 real-world compliance, regulatory research, and document analysis sessions.

[View Methodology](#methodology)

## Key Results

| Metric | Sorena Research Copilot | ChatGPT (baseline) |
| --- | --- | --- |
| Perfect sessions | 43/43 | 0/43 |
| Average coverage | 100% | 25% |
| Requirements evaluated | 4332/4332 | Avg of 2 passes |
| Factual errors | 0 | 183 |

## Benchmark Breakdown

How each tool performed across compliance research sessions

Scores reflect independent verification against source documentation.

### Coverage by Task Type

| Category | Sessions | Sorena Coverage | ChatGPT Coverage | Gap | ChatGPT Factual Errors |
| --- | ---: | ---: | ---: | ---: | ---: |
| Privacy Audits | 12 | 100% | 30% | 70pp | 43 |
| AI Act Audits | 6 | 100% | 28% | 72pp | 20 |
| Timelines | 3 | 100% | 18% | 82pp | 17 |
| Sustainability | 9 | 100% | 21% | 79pp | 53 |
| Employment Law | 2 | 100% | 18% | 82pp | 3 |
| Technical Review | 11 | 100% | 28% | 72pp | 47 |
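The Gap column is a plain difference in percentage points, and the per-category session and error counts add up to the headline totals (43 sessions, 183 ChatGPT factual errors). The following is a minimal sketch that recomputes both from the table; the tuple layout and the function names are assumptions made for illustration, not part of the published methodology.

```python
# Figures copied from the "Coverage by Task Type" table.
# Tuple layout (an assumption of this sketch):
#   (sessions, sorena_coverage_pct, chatgpt_coverage_pct, chatgpt_factual_errors)
CATEGORIES = {
    "Privacy Audits": (12, 100, 30, 43),
    "AI Act Audits": (6, 100, 28, 20),
    "Timelines": (3, 100, 18, 17),
    "Sustainability": (9, 100, 21, 53),
    "Employment Law": (2, 100, 18, 3),
    "Technical Review": (11, 100, 28, 47),
}


def gap_pp(sorena_pct, chatgpt_pct):
    """Gap in percentage points: a difference of two percentages, not a ratio."""
    return sorena_pct - chatgpt_pct


def totals(categories):
    """Return (total sessions, total ChatGPT factual errors) across categories."""
    sessions = sum(row[0] for row in categories.values())
    errors = sum(row[3] for row in categories.values())
    return sessions, errors


if __name__ == "__main__":
    for name, (n, sorena, chatgpt, errors) in CATEGORIES.items():
        print(f"{name}: {gap_pp(sorena, chatgpt)}pp gap, {n} sessions, {errors} errors")
    print("Totals (sessions, errors):", totals(CATEGORIES))
```

The sketch deliberately does not recompute the headline 25% average coverage, because the category percentages are already rounded and the aggregation used for that figure is not stated here.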

### Factual Errors

Incorrect statements presented as fact across all sessions.

- Sorena: 0 errors
- ChatGPT: 183 errors

### Requirement Coverage

Compliance requirements addressed with accurate information.

- Sorena: 100% coverage
- ChatGPT: 25% coverage

## Results by Research Session

Session-by-session benchmarks for compliance research. The scenario, score breakdown, and high-level takeaways for each session are listed under Session Details below.


| # | Category | Scenario | Sorena | ChatGPT | Errors | Pass 1 (S/C/T) | Pass 2 (S/C/T) |
| ---: | --- | --- | ---: | ---: | ---: | --- | --- |
| 1 | Privacy Audit | Privacy Notice Audit - Global e-commerce retailer | 100% | 38% | 5 | 32/16/32 | 95/25/95 |
| 2 | AI Act Compliance | AI Terms & Privacy Audit - AI lab | 100% | 19% | 3 | 42/13/42 | 156/12/156 |
| 3 | Privacy Audit | Privacy Policy Audit - Consumer device manufacturer | 100% | 21% | 5 | 40/14/40 | 247/19/247 |
| 4 | AI Act Compliance | Cloud Service Terms Audit - Major cloud provider | 100% | 19% | 4 | 49/14/49 | 205/19/205 |
| 5 | Regulatory Timeline | EUDR Timeline - Office equipment manufacturer | 100% | 19% | 9 | 30/9/30 | 207/17/207 |
| 6 | Regulatory Timeline | EUDR Timeline - Beverage multinational | 100% | 13% | 7 | 46/8/46 | 204/16/204 |
| 7 | Regulatory Timeline | EU Data Act Timeline - Connected appliance manufacturer | 100% | 26% | 1 | 28/11/28 | 33/4/33 |
| 8 | Privacy Audit | Privacy Policy Audit - Gaming platform | 100% | 43% | 3 | 28/16/28 | 47/14/47 |
| 9 | AI Act Compliance | AI Terms & Privacy Audit - AI platform | 100% | 48% | 6 | 43/23/43 | 94/40/94 |
| 10 | AI Act Compliance | Cloud Terms + DPA Audit - Cloud provider | 100% | 33% | 5 | 145/49/145 | 190/63/190 |
| 11 | AI Act Compliance | AI API Terms + Privacy Audit - Model API provider | 100% | 28% | 1 | 45/14/45 | 88/22/88 |
| 12 | Privacy Audit | Privacy Policy Audit - Global search platform | 100% | 25% | 4 | 38/12/38 | 105/20/105 |
| 13 | Privacy Audit | Privacy Policy Audit - Social platform | 100% | 22% | 5 | 50/19/50 | 92/5/92 |
| 14 | Privacy Audit | Privacy Statement Audit - Enterprise software vendor | 100% | 47% | 2 | 70/20/70 | 38/25/38 |
| 15 | AI Act Compliance | Product Terms + Privacy Audit - Enterprise cloud/vendor | 100% | 31% | 1 | 72/26/72 | 47/12/47 |
| 16 | Privacy Audit | Privacy Statement Audit - Streaming service | 100% | 33% | 5 | 41/19/41 | 74/14/74 |
| 17 | Privacy Audit | Terms + Privacy Audit - Secure messaging app | 100% | 32% | 1 | 38/18/38 | 67/11/67 |
| 18 | Privacy Audit | Privacy Policy Audit - Music streaming service | 100% | 36% | 1 | 40/21/40 | 84/17/84 |
| 19 | Privacy Audit | Privacy Policy Audit - Messaging platform | 100% | 56% | 1 | 45/18/45 | 28/20/28 |
| 20 | Privacy Audit | Privacy Policy Audit - Short-form video platform | 100% | 24% | 6 | 38/14/38 | 103/12/103 |
| 21 | Privacy Audit | Privacy Policy Audit - Social network | 100% | 30% | 5 | 120/47/120 | 66/14/66 |
| 22 | Employment Law | Union Comparison - Swedish software developer | 100% | 20% | 1 | 45/10/45 | 52/9/52 |
| 23 | Employment Law | Employment Contract Review - Sweden | 100% | 17% | 2 | 33/8/33 | 53/5/53 |
| 24 | Technical Review | Security Guidelines Review - Connected products | 100% | 25% | 12 | 26/10/26 | 141/16/141 |
| 25 | Technical Review | Cybersecurity Conformity Planning - CE/CRA readiness | 100% | 37% | 5 | 149/65/149 | 114/35/114 |
| 26 | Technical Review | IoT Security Crosswalk + Test Plan - Consumer IoT | 100% | 30% | 4 | 118/45/118 | 90/20/90 |
| 27 | Technical Review | FIPS 140 Delta Analysis - Cryptographic modules | 100% | 41% | 1 | 118/54/118 | 58/21/58 |
| 28 | Technical Review | FIPS ↔ ISO Crypto Module Mapping | 100% | 34% | 1 | 110/42/110 | 62/19/62 |
| 29 | Technical Review | ISO 27001/27002 Migration Package - ISMS update | 100% | 34% | 4 | 130/54/130 | 88/24/88 |
| 30 | Technical Review | NIST 800-53 ↔ ISO 27001/27002 Mapping | 100% | 12% | 5 | 175/40/175 | 68/1/68 |
| 31 | Technical Review | NIST CSF 1.1 to 2.0 Crosswalk | 100% | 29% | 5 | 180/72/180 | 39/7/39 |
| 32 | Technical Review | NIST 800-171 Rev. 3 Delta + CMMC Mapping | 100% | 18% | 4 | 170/30/170 | 79/14/79 |
| 33 | Technical Review | OT Security Framework Crosswalk + Gaps (IEC 62443/NIST) | 100% | 20% | 3 | 99/20/99 | 99/20/99 |
| 34 | Technical Review | PCI DSS v3.2.1 to v4.0 Delta + Crosswalk | 100% | 32% | 3 | 155/52/155 | 155/47/155 |
| 35 | Sustainability Compliance | EU Energy Efficiency Directive Readiness - IoT appliances | 100% | 26% | 8 | 81/21/81 | 83/22/83 |
| 36 | Sustainability Compliance | ESPR + Digital Product Passport Readiness - Appliances | 100% | 22% | 2 | 122/25/122 | 112/26/112 |
| 37 | Sustainability Compliance | EU Batteries Regulation Readiness - Embedded batteries | 100% | 27% | 6 | 147/35/147 | 112/34/112 |
| 38 | Sustainability Compliance | EU CSDDD Readiness - Supply chain due diligence | 100% | 34% | 4 | 98/28/98 | 98/38/98 |
| 39 | Sustainability Compliance | EU CSRD/ESRS Compliance Plan - Listed appliance manufacturer | 100% | 20% | 11 | 119/19/119 | 104/25/104 |
| 40 | Sustainability Compliance | EU CSRD/ESRS Compliance Plan - Listed automotive manufacturer | 100% | 25% | 5 | 72/19/72 | 100/23/100 |
| 41 | Sustainability Compliance | EU Green Claims Readiness - IoT appliances | 100% | 12% | 6 | 89/11/89 | 99/12/99 |
| 42 | Sustainability Compliance | EU Packaging Waste EPR Readiness - Appliances | 100% | 14% | 6 | 103/14/103 | 119/17/119 |
| 43 | Sustainability Compliance | EU Water Sustainability Readiness - IoT appliances | 100% | 14% | 5 | 145/29/145 | 137/12/137 |

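*Pass columns (S/C/T) show, for each pass, the requirements Sorena met / the requirements ChatGPT met / the total requirements evaluated. The Errors column counts ChatGPT factual errors.*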
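The percentage columns in the session table match the unweighted mean of the two per-pass coverage ratios, rounded to a whole percent. That relationship is inferred from the published figures, not from a stated formula, so treat the sketch below as an assumption. The helper names (`parse_pass`, `coverage`, `session_scores`) are hypothetical.

```python
def parse_pass(cell):
    """Split a pass cell such as '32/16/32' into (sorena_met, chatgpt_met, total)."""
    sorena, chatgpt, total = (int(part) for part in cell.split("/"))
    return sorena, chatgpt, total


def coverage(met, total):
    """Coverage for one pass, as a percentage."""
    return 100.0 * met / total


def session_scores(pass1, pass2):
    """Average the two per-pass coverages for each tool and round to a whole percent."""
    s1, c1, t1 = parse_pass(pass1)
    s2, c2, t2 = parse_pass(pass2)
    sorena = (coverage(s1, t1) + coverage(s2, t2)) / 2
    chatgpt = (coverage(c1, t1) + coverage(c2, t2)) / 2
    return round(sorena), round(chatgpt)


if __name__ == "__main__":
    # Session 1: Pass 1 = 32/16/32, Pass 2 = 95/25/95
    print(session_scores("32/16/32", "95/25/95"))
```

For session 1 this averages 50% (16 of 32) and about 26% (25 of 95) to reach the published 38%. Pooling the counts instead (41 of 127) would give about 32%, so the published score weights both passes equally regardless of how many requirements each pass contained.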

### Session Details

#### Session 1: Privacy Notice Audit - Global e-commerce retailer

**Category:** Privacy Audit | **Date:** 2026-01-06

Audit of a global e-commerce privacy notice against GDPR and CPRA/CCPA, focusing on transparency, retention, cross-border transfers, and user rights.

- **Sorena highlight:** Verified the current US and EU/UK notices, mapped GDPR + CPRA requirements to exact policy text, and produced a gap list with remediation steps.
- **ChatGPT issue:** Couldn’t validate the live notices, relied on outdated sources, and missed core disclosures (rights, request methods, categories).
- **Score:** Sorena 100% vs ChatGPT 38% | 5 factual errors
- **Pass 1:** Sorena 32/32, ChatGPT 16/32
- **Pass 2:** Sorena 95/95, ChatGPT 25/95

#### Session 2: AI Terms & Privacy Audit - AI lab

**Category:** AI Act Compliance | **Date:** 2026-01-06

Audit of an AI lab’s consumer terms and privacy policy for EU AI Act and GDPR, focusing on provider duties, transparency, and operational compliance.

- **Sorena highlight:** Mapped GDPR accountability and EU AI Act GPAI duties (copyright/TDM, watermarking, transparency) into an audit-ready checklist with citations.
- **ChatGPT issue:** Stayed high-level, omitted key GDPR accountability and AI Act obligations, and leaned on secondary sources.
- **Score:** Sorena 100% vs ChatGPT 19% | 3 factual errors
- **Pass 1:** Sorena 42/42, ChatGPT 13/42
- **Pass 2:** Sorena 156/156, ChatGPT 12/156

#### Session 3: Privacy Policy Audit - Consumer device manufacturer

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a consumer device ecosystem, assessing GDPR/CPRA disclosures, retention clarity, transfers, and rights transparency.

- **Sorena highlight:** Pinpointed GDPR/CPRA gaps like right-to-object and recipient disclosures, backed by precise citations and service-level retention details.
- **ChatGPT issue:** Missed multiple mandatory disclosures and raised a few misleading compliance concerns without grounding in the policy.
- **Score:** Sorena 100% vs ChatGPT 21% | 5 factual errors
- **Pass 1:** Sorena 40/40, ChatGPT 14/40
- **Pass 2:** Sorena 247/247, ChatGPT 19/247

#### Session 4: Cloud Service Terms Audit - Major cloud provider

**Category:** AI Act Compliance | **Date:** 2026-01-06

Contract-focused audit of cloud service terms and privacy notices for EU AI Act and GDPR coverage, including transfers, processor terms, and AI restrictions.

- **Sorena highlight:** Delivered clause-level GDPR processor analysis and AI governance review, including transfer safeguards, AI service restrictions, and concrete next steps.
- **ChatGPT issue:** Skipped contract-specific privacy and AI requirements and made unsupported claims instead of verifying the source terms.
- **Score:** Sorena 100% vs ChatGPT 19% | 4 factual errors
- **Pass 1:** Sorena 49/49, ChatGPT 14/49
- **Pass 2:** Sorena 205/205, ChatGPT 19/205

#### Session 5: EUDR Timeline - Office equipment manufacturer

**Category:** Regulatory Timeline | **Date:** 2026-01-06

EU Deforestation Regulation (EUDR) workback plan for a paper supply chain, with due diligence milestones, evidence expectations, and reporting deadlines.

- **Sorena highlight:** Built an EUDR workback plan anchored to the amended deadlines, product scope, and technical evidence requirements (geolocation, reporting, customs).
- **ChatGPT issue:** Got key dates and scope wrong and missed multiple mandatory EUDR obligations, making the timeline unsafe to rely on.
- **Score:** Sorena 100% vs ChatGPT 19% | 9 factual errors
- **Pass 1:** Sorena 30/30, ChatGPT 9/30
- **Pass 2:** Sorena 207/207, ChatGPT 17/207

#### Session 6: EUDR Timeline - Beverage multinational

**Category:** Regulatory Timeline | **Date:** 2026-01-06

EUDR compliance timeline for a global beverage supply chain, mapping commodity sourcing to scope, due diligence steps, and declaration deadlines.

- **Sorena highlight:** Mapped EUDR obligations to a beverage supply chain with commodity/CN code examples, correct deadlines, and a step-by-step evidence plan.
- **ChatGPT issue:** Misstated go-live dates and scope and omitted many legal requirements and operational steps needed for execution.
- **Score:** Sorena 100% vs ChatGPT 13% | 7 factual errors
- **Pass 1:** Sorena 46/46, ChatGPT 8/46
- **Pass 2:** Sorena 204/204, ChatGPT 16/204

#### Session 7: EU Data Act Timeline - Connected appliance manufacturer

**Category:** Regulatory Timeline | **Date:** 2026-01-06

EU Data Act compliance timeline for a connected-appliance manufacturer, covering data access, sharing, trade secrets, and cloud switching requirements.

- **Sorena highlight:** Provided a date-driven roadmap covering user access/sharing, trade secret safeguards, and cloud switching duties, with a practical workstream plan.
- **ChatGPT issue:** Covered basics but omitted critical obligations like cloud switching rules, gatekeeper restrictions, and Commission guidance milestones.
- **Score:** Sorena 100% vs ChatGPT 26% | 1 factual error
- **Pass 1:** Sorena 28/28, ChatGPT 11/28
- **Pass 2:** Sorena 33/33, ChatGPT 4/33

#### Session 8: Privacy Policy Audit - Gaming platform

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a gaming platform, focusing on GDPR transparency and CPRA/CCPA disclosures for California residents.

- **Sorena highlight:** Performed a requirement-by-requirement GDPR + CPRA audit including retention-per-category, request methods, and opt-out signal expectations.
- **ChatGPT issue:** Left out several California and GDPR specifics (submission methods, statutory disclosures) and offered less operational guidance.
- **Score:** Sorena 100% vs ChatGPT 43% | 3 factual errors
- **Pass 1:** Sorena 28/28, ChatGPT 16/28
- **Pass 2:** Sorena 47/47, ChatGPT 14/47

#### Session 9: AI Terms & Privacy Audit - AI platform

**Category:** AI Act Compliance | **Date:** 2026-01-06

Audit of an AI platform’s terms and privacy policy for EU AI Act and GDPR readiness, emphasizing transparency, training boundaries, and provider vs deployer responsibilities.

- **Sorena highlight:** Separated what the documents prove vs what’s missing, covering rights tooling, child handling, security posture, and EU AI Act GPAI obligations.
- **ChatGPT issue:** Missed practical rights pathways and several AI transparency and child-safety requirements, including a document-reference error.
- **Score:** Sorena 100% vs ChatGPT 48% | 6 factual errors
- **Pass 1:** Sorena 43/43, ChatGPT 23/43
- **Pass 2:** Sorena 94/94, ChatGPT 40/94

#### Session 10: Cloud Terms + DPA Audit - Cloud provider

**Category:** AI Act Compliance | **Date:** 2026-01-06

Audit of cloud service terms and a data processing addendum for GDPR Article 28 and EU AI Act readiness, including key contractual caveats and deployer obligations (e.g., FRIA).

- **Sorena highlight:** Mapped processor contract clauses and highlighted high-impact caveats (like pre-release scope exclusions), plus deployer AI Act duties such as FRIA.
- **ChatGPT issue:** Focused on broad GDPR alignment but missed key contractual caveats and most deployer-focused AI Act obligations.
- **Score:** Sorena 100% vs ChatGPT 33% | 5 factual errors
- **Pass 1:** Sorena 145/145, ChatGPT 49/145
- **Pass 2:** Sorena 190/190, ChatGPT 63/190

#### Session 11: AI API Terms + Privacy Audit - Model API provider

**Category:** AI Act Compliance | **Date:** 2026-01-06

Audit of an AI model API’s terms and privacy policy for GDPR and EU AI Act requirements, focusing on data-use boundaries, retention, and developer obligations.

- **Sorena highlight:** Clarified paid vs unpaid data-use boundaries, retention windows, and both GDPR + AI Act transparency duties for developers and deployers.
- **ChatGPT issue:** Skipped controller/legal-basis details and several concrete requirements, and included an incorrect AI Act citation.
- **Score:** Sorena 100% vs ChatGPT 28% | 1 factual error
- **Pass 1:** Sorena 45/45, ChatGPT 14/45
- **Pass 2:** Sorena 88/88, ChatGPT 22/88

#### Session 12: Privacy Policy Audit - Global search platform

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a global search platform, assessing data categories, purposes, rights, transfers, retention, and opt-out tooling under GDPR and CPRA.

- **Sorena highlight:** Grounded findings in policy text, covering legal bases, controller identity, opt-out tooling (GPC, ad settings), and actionable fixes.
- **ChatGPT issue:** Missed major requirements (cookies, minors, sources) and made unsupported claims contradicted by the policy.
- **Score:** Sorena 100% vs ChatGPT 25% | 4 factual errors
- **Pass 1:** Sorena 38/38, ChatGPT 12/38
- **Pass 2:** Sorena 105/105, ChatGPT 20/105

#### Session 13: Privacy Policy Audit - Social platform

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a social platform, focusing on disclosure completeness, legal bases, retention clarity, and rights mechanisms under GDPR and CPRA.

- **Sorena highlight:** Retrieved and analyzed the current geo-dynamic policy plus the US regional notice, verifying sale/share, GPC handling, and rights workflows with quotes.
- **ChatGPT issue:** Couldn’t access the current policy, relied on an outdated version, and missed essential CPRA disclosures and request mechanisms.
- **Score:** Sorena 100% vs ChatGPT 22% | 5 factual errors
- **Pass 1:** Sorena 50/50, ChatGPT 19/50
- **Pass 2:** Sorena 92/92, ChatGPT 5/92

#### Session 14: Privacy Statement Audit - Enterprise software vendor

**Category:** Privacy Audit | **Date:** 2026-01-06

Enterprise privacy statement audit for GDPR and CPRA, focusing on transparency obligations, retention, DSAR mechanics, and user rights coverage.

- **Sorena highlight:** Completed a comprehensive GDPR + CPRA audit with verified opt-out mechanisms, DSAR timelines, and cookie/ePrivacy considerations.
- **ChatGPT issue:** Covered headline items but omitted several statutory details (marketing objection, sources, timelines) needed for a compliance-grade assessment.
- **Score:** Sorena 100% vs ChatGPT 47% | 2 factual errors
- **Pass 1:** Sorena 70/70, ChatGPT 20/70
- **Pass 2:** Sorena 38/38, ChatGPT 25/38

#### Session 15: Product Terms + Privacy Audit - Enterprise cloud/vendor

**Category:** AI Act Compliance | **Date:** 2026-01-06

Audit of enterprise product terms and privacy statements for EU AI Act and GDPR, focused on contractual commitments and shared responsibilities across the AI value chain.

- **Sorena highlight:** Connected product terms, DPA expectations, and AI governance obligations, calling out what must be confirmed contractually vs operationally.
- **ChatGPT issue:** Provided a higher-level review and missed several contract-specific protections and practical compliance actions (breach timing, training safeguards).
- **Score:** Sorena 100% vs ChatGPT 31% | 1 factual error
- **Pass 1:** Sorena 72/72, ChatGPT 26/72
- **Pass 2:** Sorena 47/47, ChatGPT 12/47

#### Session 16: Privacy Statement Audit - Streaming service

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy statement audit for a streaming service, evaluating GDPR transparency and CPRA disclosures such as sharing, preference signals, and required policy structure.

- **Sorena highlight:** Verified EU/UK lawful bases and transfer safeguards, and pinpointed California disclosure gaps (GPC, 12-month lists, non-discrimination).
- **ChatGPT issue:** Made incorrect claims about lawful bases, transfers, and CPRA disclosures and conflated DNT vs GPC.
- **Score:** Sorena 100% vs ChatGPT 33% | 5 factual errors
- **Pass 1:** Sorena 41/41, ChatGPT 19/41
- **Pass 2:** Sorena 74/74, ChatGPT 14/74

#### Session 17: Terms + Privacy Audit - Secure messaging app

**Category:** Privacy Audit | **Date:** 2026-01-06

Audit of a secure messaging app’s terms and privacy disclosures for GDPR and CPRA, focusing on lawful bases, retention, rights, and audit-ready gaps.

- **Sorena highlight:** Reviewed multiple relevant sources (policy, shop opt-out page, support guidance) and flagged Art. 27 representative and CPRA signal requirements.
- **ChatGPT issue:** Missed several requirements and confused organizational structure, leading to a misleading compliance conclusion.
- **Score:** Sorena 100% vs ChatGPT 32% | 1 factual error
- **Pass 1:** Sorena 38/38, ChatGPT 18/38
- **Pass 2:** Sorena 67/67, ChatGPT 11/67

#### Session 18: Privacy Policy Audit - Music streaming service

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a music streaming service, reviewing GDPR/CPRA disclosures around data categories, sharing, international transfers, and rights.

- **Sorena highlight:** Found and cited specific policy statements (sale/share posture) and assessed children’s protections, authorized agents, and request methods.
- **ChatGPT issue:** Missed key disclosures and incorrectly claimed important statements were absent.
- **Score:** Sorena 100% vs ChatGPT 36% | 1 factual error
- **Pass 1:** Sorena 40/40, ChatGPT 21/40
- **Pass 2:** Sorena 84/84, ChatGPT 17/84

#### Session 19: Privacy Policy Audit - Messaging platform

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a messaging platform under GDPR and CPRA, including transfers, retention, rights workflows, and required disclosures.

- **Sorena highlight:** Covered GDPR + CPRA specifics, including timelines, 12-month disclosures, and nuanced cross-regulation considerations (ePrivacy, case law).
- **ChatGPT issue:** Omitted several mandatory CPRA/GDPR elements (non-discrimination, SPI scope, minors) and provided less actionable remediation.
- **Score:** Sorena 100% vs ChatGPT 56% | 1 factual error
- **Pass 1:** Sorena 45/45, ChatGPT 18/45
- **Pass 2:** Sorena 28/28, ChatGPT 20/28

#### Session 20: Privacy Policy Audit - Short-form video platform

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a short-form video platform under GDPR and CPRA, focusing on disclosures, rights, ad legal bases, and cross-border processing.

- **Sorena highlight:** Validated EEA/UK disclosures (consent for ads, complaint routes) and pinpointed California statutory gaps like required link text and SPI handling.
- **ChatGPT issue:** Missed several policy-specific disclosures and lacked statutory precision on opt-out and automated decision-making requirements.
- **Score:** Sorena 100% vs ChatGPT 24% | 6 factual errors
- **Pass 1:** Sorena 38/38, ChatGPT 14/38
- **Pass 2:** Sorena 103/103, ChatGPT 12/103

#### Session 21: Privacy Policy Audit - Social network

**Category:** Privacy Audit | **Date:** 2026-01-06

Privacy policy audit for a social network, evaluating GDPR and CPRA transparency items, user rights coverage, and retention disclosures.

- **Sorena highlight:** Mapped the policy to GDPR + CPRA with clear statutory checkpoints (toll-free methods, 12-month lists, SPI limit-use) and actionable remediation.
- **ChatGPT issue:** Skipped critical California format and request-method requirements and left gaps in indirect-source and necessity disclosures.
- **Score:** Sorena 100% vs ChatGPT 30% | 5 factual errors
- **Pass 1:** Sorena 120/120, ChatGPT 47/120
- **Pass 2:** Sorena 66/66, ChatGPT 14/66

#### Session 22: Union Comparison - Swedish software developer

**Category:** Employment Law | **Date:** 2026-01-07

Comparison of Swedish unions and collective agreements for a full-time software developer, covering benefits, tradeoffs, and agreement coverage.

- **Sorena highlight:** Compared unions and collective agreements with the practical details that drive decisions (time bank, sick pay layers, pensions, notice periods).
- **ChatGPT issue:** Stayed at a high level, missed core CBA differences, and included an incorrect benefit-duration claim.
- **Score:** Sorena 100% vs ChatGPT 20% | 1 factual error
- **Pass 1:** Sorena 45/45, ChatGPT 10/45
- **Pass 2:** Sorena 52/52, ChatGPT 9/52

#### Session 23: Employment Contract Review - Sweden

**Category:** Employment Law | **Date:** 2026-01-07

Employment contract compliance review under Swedish law, identifying risk areas, missing mandatory elements, and practical remediation guidance.

- **Sorena highlight:** Applied the right Swedish-law framework (CBA context, working time limits, deductions, sick pay) and separated real risks from non-issues.
- **ChatGPT issue:** Flagged clauses as violations without CBA context and missed several statutory requirements needed for a defensible review.
- **Score:** Sorena 100% vs ChatGPT 17% | 2 factual errors
- **Pass 1:** Sorena 33/33, ChatGPT 8/33
- **Pass 2:** Sorena 53/53, ChatGPT 5/53

#### Session 24: Security Guidelines Review - Connected products

**Category:** Technical Review | **Date:** 2026-01-07

Technical review of connected product security guidelines, identifying inconsistencies and aligning requirements to real regulatory regimes and standards.

- **Sorena highlight:** Turned internal security guidance into an audit-ready, regulator-aligned checklist (CRA/RED/EN 303 645) with dates, scope boundaries, and citations.
- **ChatGPT issue:** Missed most regulatory specifics and produced multiple incorrect or vague suggestions that weaken the guideline.
- **Score:** Sorena 100% vs ChatGPT 25% | 12 factual errors
- **Pass 1:** Sorena 26/26, ChatGPT 10/26
- **Pass 2:** Sorena 141/141, ChatGPT 16/141

#### Session 25: Cybersecurity Conformity Planning - CE/CRA readiness

**Category:** Technical Review | **Date:** 2026-01-10

Cybersecurity conformity assessment planning for CE/RED readiness, including evidence artifacts, assessment steps, test strategy, and documentation expectations.

- **Sorena highlight:** Produced a CE conformity assessment plan with harmonized-standard traceability (OJ entries, clause IDs) and concrete pass/fail test criteria.
- **ChatGPT issue:** Gave a conceptual plan but missed traceability details auditors need (OJ numbers, provision IDs, acceptance criteria) and had version inconsistencies.
- **Score:** Sorena 100% vs ChatGPT 37% | 5 factual errors
- **Pass 1:** Sorena 149/149, ChatGPT 65/149
- **Pass 2:** Sorena 114/114, ChatGPT 35/114

#### Session 26: IoT Security Crosswalk + Test Plan - Consumer IoT

**Category:** Technical Review | **Date:** 2026-01-10

Consumer IoT security crosswalk and test plan, mapping ETSI and NIST requirements into testable procedures and evidence lists.

- **Sorena highlight:** Delivered a full ETSI EN 303 645 ↔ NIST 8259A crosswalk with quote-level traceability and assessor-grade test techniques.
- **ChatGPT issue:** Provided a general mapping but lacked verifiable quote anchors, missed key control extractions, and suggested risky version/citation approaches.
- **Score:** Sorena 100% vs ChatGPT 30% | 4 factual errors
- **Pass 1:** Sorena 118/118, ChatGPT 45/118
- **Pass 2:** Sorena 90/90, ChatGPT 20/90

#### Session 27: FIPS 140 Delta Analysis - Cryptographic modules

**Category:** Technical Review | **Date:** 2026-01-10

Delta analysis of FIPS 140-1 vs FIPS 140-2 for cryptographic modules, highlighting changed requirements and assessment implications.

- **Sorena highlight:** Captured clause-by-clause deltas with testable requirements, including numeric thresholds and CMVP-ready artifacts.
- **ChatGPT issue:** Covered the basics but missed many validation-critical details (authentication thresholds, DTR/IG references, level-specific RNG rules).
- **Score:** Sorena 100% vs ChatGPT 41% | 1 factual error
- **Pass 1:** Sorena 118/118, ChatGPT 54/118
- **Pass 2:** Sorena 58/58, ChatGPT 21/58

#### Session 28: FIPS ↔ ISO Crypto Module Mapping

**Category:** Technical Review | **Date:** 2026-01-10

Crosswalk between FIPS and ISO/IEC cryptographic module requirements, mapping controls and clarifying evidence expectations for audits.

- **Sorena highlight:** Mapped FIPS 140-2/140-3 to ISO 19790 with deep links to the SP 800-140x series and validation evidence expectations.
- **ChatGPT issue:** Delivered a partial crosswalk but omitted key precision items (verbatim section quotes, interface taxonomy, program documents).
- **Score:** Sorena 100% vs ChatGPT 34% | 1 factual error
- **Pass 1:** Sorena 110/110, ChatGPT 42/110
- **Pass 2:** Sorena 62/62, ChatGPT 19/62

#### Session 29: ISO 27001/27002 Migration Package - ISMS update

**Category:** Technical Review | **Date:** 2026-01-10

ISO 27001/27002 migration package from 2013 to 2022, covering control changes, reorganization themes, and statement of applicability updates.

- **Sorena highlight:** Created a practitioner-ready migration kit: machine-readable change matrix, filled SoA examples, and a timeboxed transition plan backed by authoritative sources.
- **ChatGPT issue:** Missed key migration artifacts (Annex B baseline, CSV/filled SoA) and made a couple of unverified claims about control groupings.
- **Score:** Sorena 100% vs ChatGPT 34% | 4 factual errors
- **Pass 1:** Sorena 130/130, ChatGPT 54/130
- **Pass 2:** Sorena 88/88, ChatGPT 24/88

#### Session 30: NIST 800-53 ↔ ISO 27001/27002 Mapping

**Category:** Technical Review | **Date:** 2026-01-10

Control mapping between NIST SP 800-53 Rev. 5 and ISO/IEC 27001:2022 Annex A to support alignment, crosswalks, and audit preparation.

- **Sorena highlight:** Explained Rev. 4 to Rev. 5 changes and produced an auditor-friendly crosswalk to ISO with gaps, tests, and source-grounded rationale.
- **ChatGPT issue:** Relied too much on workbook references and provided fewer verifiable anchors and test/evidence details.
- **Score:** Sorena 100% vs ChatGPT 12% | 5 factual errors
- **Pass 1:** Sorena 175/175, ChatGPT 40/175
- **Pass 2:** Sorena 68/68, ChatGPT 1/68

#### Session 31: NIST CSF 1.1 to 2.0 Crosswalk

**Category:** Technical Review | **Date:** 2026-01-10

Crosswalk from NIST Cybersecurity Framework 1.1 to 2.0, highlighting changes and mapping structure to support transition planning.

- **Sorena highlight:** Mapped CSF changes to practical transition steps and linked crosswalks to authoritative artifacts for traceability and automation.
- **ChatGPT issue:** Delivered a reasonable summary but provided fewer pointers to official mapping exports and less detail on profile/tier migration.
- **Score:** Sorena 100% vs ChatGPT 29% | 5 factual errors
- **Pass 1:** Sorena 180/180, ChatGPT 72/180
- **Pass 2:** Sorena 39/39, ChatGPT 7/39

#### Session 32: NIST 800-171 Rev. 3 Delta + CMMC Mapping

**Category:** Technical Review | **Date:** 2026-01-10

Clause-level delta analysis of NIST SP 800-171 Rev. 2 vs Rev. 3 with CMMC 2.0 mapping, identifying added objectives and assessment impact.

- **Sorena highlight:** Produced audit-defensible deltas with examples, ODP governance, and a clear Rev. 3 to Rev. 2 to CMMC mapping approach.
- **ChatGPT issue:** Listed new requirements but missed assessment-method impacts and source traceability, and added a nonessential news citation.
- **Score:** Sorena 100% vs ChatGPT 18% | 4 factual errors
- **Pass 1:** Sorena 170/170, ChatGPT 30/170
- **Pass 2:** Sorena 79/79, ChatGPT 14/79

#### Session 33: OT Security Framework Crosswalk + Gaps (IEC 62443/NIST)

**Category:** Technical Review | **Date:** 2026-01-10

OT security framework crosswalk between IEC 62443 requirements and NIST SP 800-82 guidance, identifying gaps plus example tests and evidence.

- **Sorena highlight:** Built an OT-safe IEC 62443 ↔ NIST 800-82 crosswalk with verification methods, evidence, and a realistic gap model.
- **ChatGPT issue:** Covered high-level mapping but missed several nuanced gaps and lacked the same level of audit-defensible sourcing.
- **Score:** Sorena 100% vs ChatGPT 20% | 3 factual errors
- **Pass 1:** Sorena 99/99, ChatGPT 20/99
- **Pass 2:** Sorena 99/99, ChatGPT 20/99

#### Session 34: PCI DSS v3.2.1 to v4.0 Delta + Crosswalk

**Category:** Technical Review | **Date:** 2026-01-10

PCI DSS v3.2.1 to v4.0 delta analysis with crosswalks to NIST SP 800-53 Rev. 5 and ISO/IEC 27001:2022, including key changes and timelines.

- **Sorena highlight:** Delivered a PCI DSS migration package with authoritative citations, crosswalks, and evidence-ready remediation guidance.
- **ChatGPT issue:** Missed several audit-defensibility elements (official artifacts, full quotes) and included an incorrect timeline claim.
- **Score:** Sorena 100% vs ChatGPT 32% | 3 factual errors
- **Pass 1:** Sorena 155/155, ChatGPT 52/155
- **Pass 2:** Sorena 155/155, ChatGPT 47/155

#### Session 35: EU Energy Efficiency Directive Readiness - IoT appliances

**Category:** Sustainability Compliance | **Date:** 2026-01-14

Readiness assessment for an EU IoT home-appliance manufacturer under the EU Energy Efficiency Directive, including obligations, exemptions, and a practical implementation plan.

- **Sorena highlight:** Separated what is mandatory vs optional, identified applicability triggers, and produced an evidence-driven readiness roadmap with governance and reporting steps.
- **ChatGPT issue:** Missed or diluted multiple explicit requirements and produced several overconfident obligations without sufficient grounding.
- **Score:** Sorena 100% vs ChatGPT 26% | 8 factual errors
- **Pass 1:** Sorena 81/81, ChatGPT 21/81
- **Pass 2:** Sorena 83/83, ChatGPT 22/83

#### Session 36: ESPR + Digital Product Passport Readiness - Appliances

**Category:** Sustainability Compliance | **Date:** 2026-01-14

Readiness assessment for ESPR and Digital Product Passport obligations for an EU smart-appliance manufacturer, covering applicability, data requirements, and execution plan.

- **Sorena highlight:** Mapped the expected DPP/ESPR obligations to concrete product, data, and supply-chain controls with implementation sequencing.
- **ChatGPT issue:** Left key requirements vague, missed multiple Sorena-identified constraints, and under-specified required artifacts and scope conditions.
- **Score:** Sorena 100% vs ChatGPT 22% | 2 factual errors
- **Pass 1:** Sorena 122/122, ChatGPT 25/122
- **Pass 2:** Sorena 112/112, ChatGPT 26/112

#### Session 37: EU Batteries Regulation Readiness - Embedded batteries

**Category:** Sustainability Compliance | **Date:** 2026-01-14

Readiness plan for EU Batteries Regulation obligations relevant to consumer appliances with embedded or supplied batteries, including labeling, due diligence, and reporting.

- **Sorena highlight:** Provided a compliance-ready breakdown of obligations by battery type and role (producer/importer), with evidence deliverables and timeline discipline.
- **ChatGPT issue:** Overlooked several explicit obligations and mis-prioritized workstreams, creating compliance gaps for key battery-related duties.
- **Score:** Sorena 100% vs ChatGPT 27% | 6 factual errors
- **Pass 1:** Sorena 147/147, ChatGPT 35/147
- **Pass 2:** Sorena 112/112, ChatGPT 34/112

#### Session 38: EU CSDDD Readiness - Supply chain due diligence

**Category:** Sustainability Compliance | **Date:** 2026-01-14

Readiness assessment for EU corporate sustainability due diligence obligations for an EU-listed appliance manufacturer, including governance, risk mapping, and remediation.

- **Sorena highlight:** Turned due diligence requirements into implementable controls: governance, policy, risk mapping, supplier engagement, grievance handling, and reporting.
- **ChatGPT issue:** Missed several explicit duties and introduced ambiguous guidance that would leave audit-critical evidence and controls incomplete.
- **Score:** Sorena 100% vs ChatGPT 34% | 4 factual errors
- **Pass 1:** Sorena 98/98, ChatGPT 28/98
- **Pass 2:** Sorena 98/98, ChatGPT 38/98

#### Session 39: EU CSRD/ESRS Compliance Plan - Listed appliance manufacturer

**Category:** Sustainability Compliance | **Date:** 2026-01-14

CSRD/ESRS compliance applicability and readiness plan for an EU-listed smart-appliance manufacturer, including reporting scope, materiality, assurance, and data controls.

- **Sorena highlight:** Clarified applicability boundaries and produced a compliance program plan spanning governance, materiality, ESRS datapoints, assurance, and disclosure logistics.
- **ChatGPT issue:** Had multiple scope/timeline inconsistencies and missed high-impact requirements around reporting mechanics and assurance-readiness artifacts.
- **Score:** Sorena 100% vs ChatGPT 20% | 11 factual errors
- **Pass 1:** Sorena 119/119, ChatGPT 19/119
- **Pass 2:** Sorena 104/104, ChatGPT 25/104

#### Session 40: EU CSRD/ESRS Compliance Plan - Listed automotive manufacturer

**Category:** Sustainability Compliance | **Date:** 2026-01-14

CSRD/ESRS applicability and compliance plan for an EU-listed automotive manufacturer, including ESRS scope, phased timelines, and operational reporting readiness.

- **Sorena highlight:** Provided a structured, evidence-driven program plan with clear scoping, sequencing, and accountability to operationalize ESRS reporting.
- **ChatGPT issue:** Missed multiple explicit requirements and introduced misleading simplifications that would create gaps in CSRD reporting readiness.
- **Score:** Sorena 100% vs ChatGPT 25% | 5 factual errors
- **Pass 1:** Sorena 72/72, ChatGPT 19/72
- **Pass 2:** Sorena 100/100, ChatGPT 23/100

#### Session 41: EU Green Claims Readiness - IoT appliances

**Category:** Sustainability Compliance | **Date:** 2026-01-14

Readiness assessment for EU green-claims compliance in marketing and product communications for an EU IoT appliance manufacturer.

- **Sorena highlight:** Converted green-claims obligations into a practical substantiation workflow: claims inventory, evidence standards, governance, and review gates.
- **ChatGPT issue:** Overlooked key compliance requirements and provided under-scoped guidance that could increase greenwashing risk.
- **Score:** Sorena 100% vs ChatGPT 12% | 6 factual errors
- **Pass 1:** Sorena 89/89, ChatGPT 11/89
- **Pass 2:** Sorena 99/99, ChatGPT 12/99

#### Session 42: EU Packaging Waste EPR Readiness - Appliances

**Category:** Sustainability Compliance | **Date:** 2026-01-14

Packaging waste and EPR compliance readiness plan for an EU home-appliance manufacturer, covering registration, reporting, labeling, and operational controls.

- **Sorena highlight:** Outlined a compliance-ready EPR program with country-by-country obligations, operational ownership, and reporting/evidence requirements.
- **ChatGPT issue:** Missed several explicit obligations and under-specified evidence and process controls required for multi-country EPR compliance.
- **Score:** Sorena 100% vs ChatGPT 14% | 6 factual errors
- **Pass 1:** Sorena 103/103, ChatGPT 14/103
- **Pass 2:** Sorena 119/119, ChatGPT 17/119

#### Session 43: EU Water Sustainability Readiness - IoT appliances

**Category:** Sustainability Compliance | **Date:** 2026-01-14

EU water-sustainability and water-efficiency compliance readiness plan for IoT appliances, including product efficiency, disclosures, and governance.

- **Sorena highlight:** Translated water-efficiency obligations and expectations into actionable controls, product requirements, and evidence-backed readiness steps.
- **ChatGPT issue:** Left multiple requirements uncovered and included misleading generalizations that would not hold up in an audit.
- **Score:** Sorena 100% vs ChatGPT 14% | 5 factual errors
- **Pass 1:** Sorena 145/145, ChatGPT 29/145
- **Pass 2:** Sorena 137/137, ChatGPT 12/137

## Why This Matters for Your Organization

Purpose-built AI for compliance research delivers measurable advantages

- **Complete Coverage**: 100% coverage across 4,332 requirements, with no surprise gaps left for auditors to find.
- **Zero Factual Errors**: 0 factual errors flagged across 43 sessions, reducing the risk of acting on incorrect information.
- **Audit-Ready Citations**: Direct links to exact text passages in legal documents for full traceability.
- **Specialized Expertise**: Purpose-built for regulatory research, not a general-purpose tool stretched thin.

## How We Evaluated

Transparent two-step scoring process with an independent second review

### Evaluation Overview

| Field | Value |
| --- | --- |
| Period | Jan 2026 |
| Task Categories | 6 |
| Total Sessions | 43 |
| Requirements Evaluated | 4,332 |
| Internet Access | Enabled |
| Reasoning Effort | High |

### Scoring Criteria

A requirement was marked correct only if the response:

- Explicitly addressed the requirement
- Provided accurate information
- Cited verifiable sources where applicable
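
The three criteria above amount to an all-or-nothing check per requirement. The sketch below is one possible encoding, not the auditors' actual tooling: the `RequirementReview` record, its field names, and the helper functions are hypothetical. It reads "where applicable" as an assumption that the citation criterion is waived when no source is expected.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RequirementReview:
    """One auditor's judgment of one requirement (hypothetical record layout)."""
    addressed: bool            # response explicitly addressed the requirement
    accurate: bool             # information provided was accurate
    citation_applicable: bool  # a verifiable source is expected for this item
    cited: bool                # response cited a verifiable source


def is_correct(review: RequirementReview) -> bool:
    """A requirement counts only if every applicable criterion is met."""
    # Assumption: the citation check is skipped when no citation is expected.
    citation_ok = review.cited or not review.citation_applicable
    return review.addressed and review.accurate and citation_ok


def count_correct(reviews: Iterable[RequirementReview]) -> int:
    """Number of requirements marked correct in one pass."""
    return sum(is_correct(r) for r in reviews)
```

Under this reading, a response that addresses a requirement accurately but omits an expected citation is not counted as correct.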

### Independent Dual Review

Each session was scored independently by two auditors. Neither auditor saw the other's evaluation until scoring was complete.

- Auditor 1: Independent review against compliance requirements
- Auditor 2: Independent review against compliance requirements

Scores shown are the combined average from both auditors.
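
The report does not spell out the averaging formula. As a minimal sketch, the code below assumes each session score is the mean of the two per-pass coverage percentages, rounded to the nearest whole percent. The function names are illustrative; the pass counts are taken from the session results above.

```python
from typing import Sequence, Tuple


def pass_coverage(correct: int, total: int) -> float:
    """Coverage for one pass, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def session_score(passes: Sequence[Tuple[int, int]]) -> int:
    """Mean of per-pass coverage percentages, rounded to a whole percent.

    Assumption: percentages are averaged; counts are not pooled.
    """
    percentages = [pass_coverage(correct, total) for correct, total in passes]
    return round(sum(percentages) / len(percentages))


# ChatGPT pass counts as listed for Session 34 and Session 43.
print(session_score([(52, 155), (47, 155)]))  # 32
print(session_score([(29, 145), (12, 137)]))  # 14
```

Averaging percentages rather than pooling counts is an inference from the published figures: pooling Session 43 would give 41/282, about 14.5%, which rounds to 15% instead of the reported 14%.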

### Disclaimers

- Results are based on an internal evaluation conducted in January 2026.
- ChatGPT (baseline) is OpenAI ChatGPT, used as a general-purpose AI comparison.
- Factual-error counts apply to ChatGPT responses only; no factual errors were flagged in Sorena responses.
- This evaluation focused on regulatory and compliance research tasks.
- Results may vary depending on specific use case and document types.
- This report is not a substitute for legal counsel or professional advice.

## Related Solutions

- [Research Copilot](/solutions/research-copilot.md)
- [ESG Compliance](/solutions/esg-compliance.md)
- [Assessment Autopilot](/solutions/assessment.md)
- [All Solutions](/solutions.md)

## Ready to Experience the Difference?

See how our Research Copilot can transform your compliance research with a personalized demo.

[Schedule a Demo](/contact.md)


---

(c) 2026 Sorena AB (559573-7338). All rights reserved.
