Archiving, eDiscovery & Compliance Glossary
Plain-language definitions of the terms that matter in enterprise information archiving, eDiscovery, and regulatory compliance — from WORM storage and legal holds to FOIA, SEC 17a-4, and EDRM.
Core concepts
- Information Archiving
- Information archiving is the practice of capturing, indexing, and preserving an organization's communications and electronic records in a secure, searchable, tamper-evident repository so they can be retained, retrieved, and produced for compliance, governance, and discovery.
- Enterprise Information Archiving (EIA)
- Enterprise Information Archiving (EIA) is the software category covering products that capture and retain email, files, and collaboration and messaging data at organization scale for regulatory compliance, eDiscovery, and records management. EIA platforms typically add supervision, retention management, and search on top of a central archive.
- eDiscovery
- Electronic discovery (eDiscovery) is the process of identifying, preserving, collecting, searching, reviewing, and producing electronically stored information (ESI) as evidence in litigation, investigations, or regulatory matters. Modern eDiscovery runs directly against an archive to cut the time and cost of responding to legal requests.
- Email Archiving
- Email archiving is the automated capture and long-term preservation of inbound, outbound, and internal email in a separate, immutable store — independent of the mail server — so messages remain complete, unaltered, and searchable for retention and discovery.
- Data Archiving
- Data archiving moves information that is no longer in active use into long-term, lower-cost, policy-governed storage where it stays retrievable. Unlike a backup (a short-term copy for disaster recovery), an archive is the system of record retained to satisfy compliance and discovery obligations.
- EDRM (Electronic Discovery Reference Model)
- The Electronic Discovery Reference Model (EDRM) is the widely used framework that describes the stages of eDiscovery: information governance, identification, preservation, collection, processing, review, analysis, production, and presentation. The stages are iterative rather than strictly sequential. The EDRM project has also published an XML interchange schema (EDRM XML) intended for moving data between discovery tools.
- Early Case Assessment (ECA)
- Early case assessment (ECA) is the practice of analyzing a potential matter's data early — its volume, key custodians, date ranges, and themes — to estimate risk, cost, and strategy before committing to full review. Running ECA against an archive helps cull irrelevant data and narrow scope.
- Culling
- Culling is the filtering of a collected data set to remove material that is irrelevant, duplicative, or out of scope — by date range, custodian, keyword, file type, or deduplication — so that only relevant documents proceed to costly review.
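The culling filters described above (custodian, date range, keyword, and deduplication) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular platform implements culling: the `Doc` record, the `cull` function, and the sample data are all hypothetical, and deduplication here is a simple exact match on the message body.

```python
from dataclasses import dataclass
from datetime import date
import hashlib


@dataclass(frozen=True)
class Doc:
    # Hypothetical minimal document record; real collections carry far more metadata.
    custodian: str
    sent: date
    body: str


def cull(docs, custodians, start, end, keywords):
    """Keep documents that pass every scope filter, dropping exact duplicates."""
    seen = set()
    kept = []
    for doc in docs:
        if doc.custodian not in custodians:
            continue  # out-of-scope custodian
        if not (start <= doc.sent <= end):
            continue  # outside the date range
        text = doc.body.lower()
        if not any(k.lower() in text for k in keywords):
            continue  # no keyword hit
        digest = hashlib.sha256(doc.body.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate of a document already kept
        seen.add(digest)
        kept.append(doc)
    return kept


docs = [
    Doc("alice", date(2023, 3, 1), "Pricing for the merger draft"),
    Doc("alice", date(2023, 3, 1), "Pricing for the merger draft"),  # duplicate
    Doc("bob", date(2023, 3, 2), "Lunch on Friday?"),
    Doc("carol", date(2021, 1, 5), "Merger timeline"),
]
culled = cull(docs, {"alice", "bob"}, date(2023, 1, 1), date(2023, 12, 31), ["merger"])
print(len(culled))  # 1
```

Applying the cheap metadata filters (custodian, date) before the keyword and hash checks keeps the more expensive work for documents that are still in scope.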
Capture
- Journaling
- Journaling is a capture method in which a copy of every message is automatically delivered to the archive at the moment it is sent or received, independent of the user's mailbox. Journaling ensures a complete, unaltered record even if a user later deletes the original.
- PST / EML / MSG
- PST, EML, and MSG are common email file formats. PST is a Microsoft Outlook container that stores many messages and folders; EML and MSG store individual messages. Archiving and eDiscovery platforms ingest and export these formats to interoperate with mail systems and review tools.
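As an illustration of ingesting one of these formats, the sketch below parses a single EML message with Python's standard `email` package and pulls out the fields an archive would typically index. The sample message and the `record` field names are assumptions made for this example. PST and MSG are proprietary Microsoft formats and generally need third-party libraries, which are not shown here.

```python
from email import message_from_string, policy

# Hypothetical sample message; an .eml file holds the same RFC 5322 text.
RAW_EML = """\
From: alice@example.com
To: bob@example.com
Subject: Q3 forecast
Date: Mon, 06 Mar 2023 09:15:00 +0000
Message-ID: <abc123@example.com>
Content-Type: text/plain; charset="utf-8"

Numbers attached separately.
"""

# policy.default returns an EmailMessage with structured, decoded headers.
msg = message_from_string(RAW_EML, policy=policy.default)

record = {
    "message_id": str(msg["Message-ID"]),
    "sender": str(msg["From"]),
    "recipients": str(msg["To"]),
    "subject": str(msg["Subject"]),
    # The Date header exposes a parsed datetime under the default policy.
    "sent": msg["Date"].datetime.isoformat(),
    "body": msg.get_body(preferencelist=("plain",)).get_content().strip(),
}
print(record["subject"], record["sent"])
```

For files on disk, `email.message_from_binary_file` with the same policy is the usual entry point, since EML files are byte streams rather than decoded text.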
Integrity & retention
- WORM Storage
- WORM (Write Once, Read Many) storage allows data to be written a single time and then read repeatedly, but not altered or deleted before its retention period expires. WORM is closely associated with regulations such as SEC Rule 17a-4, which has long called for non-rewriteable, non-erasable preservation (and, since amendments adopted in 2022, also permits an audit-trail alternative), because it keeps archived records immutable.
- Immutability
- Immutability is the property that a stored record cannot be modified or overwritten once committed. Immutable archives combine enforcement techniques such as WORM storage and write-protection with cryptographic hashing, which makes it possible to verify that preserved data is identical to what was originally captured.
- Retention Policy (Retention Schedule)
- A retention policy is a set of rules defining how long each category of record must be kept and what happens when that period ends. Retention schedules are driven by regulation, legal exposure, and business need, and they underpin both compliance and defensible deletion.
- Defensible Deletion
- Defensible deletion is the documented, policy-driven disposal of data that has met its retention requirement and is not subject to any legal hold. Done correctly, with consistent policies and audit trails, it reduces storage cost, risk, and discovery scope while standing up to legal scrutiny.
- Single-Instance Storage (Deduplication)
- Single-instance storage, or deduplication, stores only one copy of identical content (such as a message sent to many recipients or a repeated attachment) while preserving each reference. It reduces archive size and cost without losing any record.
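Single-instance storage, as defined above, is commonly built on content addressing: identical content hashes to the same key, so it is stored once while every logical record keeps its own reference. The sketch below is a toy in-memory version; the `SingleInstanceStore` class and its record IDs are hypothetical and stand in for whatever a real archive uses.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: one blob per unique content, many references."""

    def __init__(self):
        self._blobs = {}  # digest -> bytes (each unique content stored once)
        self._refs = {}   # record id -> digest (one entry per logical record)

    def put(self, record_id, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)  # store the bytes only if new
        self._refs[record_id] = digest           # always keep the reference
        return digest

    def get(self, record_id) -> bytes:
        return self._blobs[self._refs[record_id]]

    @property
    def blob_count(self):
        return len(self._blobs)

    @property
    def record_count(self):
        return len(self._refs)


store = SingleInstanceStore()
attachment = b"quarterly-report.pdf bytes"
for recipient in ("alice", "bob", "carol"):
    store.put(f"msg-1/{recipient}", attachment)
store.put("msg-2/dave", b"a different file")

print(store.record_count, store.blob_count)  # 4 2
```

Four logical records resolve to two stored blobs, and each recipient's copy is still individually retrievable, which is the "without losing any record" property.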
Compliance & legal
- Legal Hold (Litigation Hold)
- A legal hold (or litigation hold) is a directive that suspends the normal deletion or modification of records relevant to anticipated or pending litigation, investigation, or audit. Data under hold is preserved unaltered, overriding retention schedules, until the hold is released.
- Chain of Custody
- Chain of custody is the documented, unbroken record of how electronic evidence was captured, stored, accessed, and produced — who handled it, when, and what was done. A defensible chain of custody establishes that archived data is authentic and has not been tampered with.
- Audit Trail
- An audit trail is a chronological, tamper-evident log of actions taken within a system — searches, exports, policy changes, and access. Audit trails demonstrate accountability and are required by many compliance regimes to prove how records and the archive itself were handled.
- Supervision (Communications Surveillance)
- Supervision is the systematic review of archived communications against policies and lexicons to detect misconduct, conflicts of interest, or regulatory breaches. FINRA and SEC rules require certain firms to supervise employees' business communications.
- PII / PHI
- Personally Identifiable Information (PII) is data that can identify an individual; Protected Health Information (PHI) is health data tied to an individual under HIPAA. Archives detect and govern PII/PHI to apply the right security, retention, and privacy controls.
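One common way to make an audit trail tamper-evident, as the Audit Trail entry in this section requires, is to chain entries by hash: each entry's hash covers its own fields plus the previous entry's hash, so changing any earlier entry breaks verification from that point on. The sketch below assumes that approach; the `append` and `verify` helpers and the entry fields are hypothetical, and timestamps are left out only to keep the example deterministic.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, entry):
    # Canonical JSON (sorted keys) so the same entry always hashes the same way.
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(log, actor, action, target):
    """Add an entry whose hash covers its fields and the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"seq": len(log), "actor": actor, "action": action, "target": target}
    log.append({**entry, "hash": _digest(prev_hash, entry)})


def verify(log):
    """Recompute every hash in order; any edited entry makes this return False."""
    prev_hash = GENESIS
    for record in log:
        entry = {k: v for k, v in record.items() if k != "hash"}
        if record["hash"] != _digest(prev_hash, entry):
            return False
        prev_hash = record["hash"]
    return True


log = []
append(log, "analyst1", "search", "custodian:alice")
append(log, "analyst1", "export", "matter-42")
print(verify(log))  # True

log[0]["target"] = "custodian:bob"  # simulate tampering with an earlier entry
print(verify(log))  # False
```

A hash chain only detects tampering; preventing it still depends on where the log is stored and who can write to it.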
Regulations
- FOIA (Freedom of Information Act)
- The Freedom of Information Act (FOIA) gives the public the right to request records held by U.S. federal agencies; state public-records laws provide equivalent rights at the state and local level. Agencies must search, review, redact, and produce responsive records within statutory deadlines (under the federal FOIA, generally 20 working days for an initial determination), which makes fast, accurate archive search essential.
- SEC Rule 17a-4
- SEC Rule 17a-4 requires broker-dealers to preserve certain electronic records for specified periods, with indexing and prompt retrievability. The rule historically required a non-rewriteable, non-erasable (WORM) format; amendments adopted in 2022 also allow an audit-trail alternative that preserves a complete, time-stamped record of any modification or deletion. It is one of the most cited drivers of immutable email and communications archiving in financial services.
- FINRA
- The Financial Industry Regulatory Authority (FINRA) oversees U.S. broker-dealers and sets rules for retaining and supervising business communications, including electronic messaging and social media. FINRA expects firms to capture, retain, and review communications and to produce them on request.
- MiFID II
- The Markets in Financial Instruments Directive II (MiFID II) is an EU directive that, among other things, requires firms to record and retain communications related to transactions, including phone calls and electronic messages, for five years (or up to seven where the competent authority requests it).
- HIPAA
- The Health Insurance Portability and Accountability Act (HIPAA) sets U.S. standards for protecting health information. For archiving, HIPAA drives secure capture and retention of communications containing protected health information (PHI), with access controls, encryption, and audit logging.
- CJIS
- The Criminal Justice Information Services (CJIS) Security Policy governs how criminal justice information is accessed, stored, and protected by law enforcement and their vendors. CJIS mandates strong encryption, strict access control, and audit readiness for systems that hold this data.
- GDPR
- The General Data Protection Regulation (GDPR) is the EU privacy law governing personal data. It creates obligations such as data minimization and the right to erasure, which archives must reconcile with retention requirements through granular policy and defensible deletion.
Common questions
What is the difference between archiving and backup?
A backup is a short-term, recoverable copy of data used to restore systems after a failure. An archive is the long-term system of record — immutable, indexed, and policy-governed — kept to satisfy compliance and eDiscovery obligations. Backups answer 'can we recover?'; archives answer 'can we prove and produce?'
What is the difference between eDiscovery and archiving?
Archiving is the ongoing capture and preservation of records in a searchable, tamper-evident repository. eDiscovery is the process of finding, reviewing, and producing specific records from that repository (or other sources) as evidence. A strong archive makes eDiscovery faster and cheaper.
Why is WORM storage required for compliance?
Regulations such as SEC Rule 17a-4 call for records to be preserved so that they cannot be altered or deleted before retention expires, traditionally in a non-rewriteable, non-erasable format. WORM storage enforces this immutability, making the archive trustworthy as evidence.
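The write-once rule can be illustrated with a small in-memory model: a record may be written once, read any number of times, and deleted only after its retention date has passed. This is a sketch of the semantics only. The `WormStore` class and `WormViolation` exception are hypothetical, and real WORM guarantees come from the storage layer itself rather than from application code like this.

```python
from datetime import date


class WormViolation(Exception):
    """Raised when a write-once rule would be broken."""


class WormStore:
    def __init__(self):
        self._records = {}  # record id -> (content, retain_until)

    def write(self, record_id, content: bytes, retain_until: date):
        if record_id in self._records:
            raise WormViolation(f"{record_id} already written; overwrite refused")
        self._records[record_id] = (content, retain_until)

    def read(self, record_id) -> bytes:
        return self._records[record_id][0]

    def delete(self, record_id, today: date):
        _, retain_until = self._records[record_id]
        if today <= retain_until:
            raise WormViolation(f"{record_id} is retained until {retain_until}")
        del self._records[record_id]


store = WormStore()
store.write("msg-1", b"original", retain_until=date(2030, 1, 1))
try:
    store.write("msg-1", b"edited", retain_until=date(2030, 1, 1))
except WormViolation as exc:
    print("refused:", exc)
print(store.read("msg-1"))  # b'original'
```

The second write is refused and the original bytes are still returned, which is the behavior regulators are looking for when they ask for non-rewriteable, non-erasable storage.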
How long do I need to retain records?
Retention periods depend on the record type and the regulations that apply. For example, SEC Rule 17a-4 generally sets three- or six-year periods for broker-dealer records depending on the record type, and MiFID II requires covered communications to be kept for at least five years, while other records follow internal or sector-specific schedules. A retention policy maps each category to its required period and disposal rule.
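A retention policy of this kind can be modeled as a lookup from record category to retention period, combined with a legal-hold check before anything is disposed of. The sketch below assumes that shape. The category names and year values in `RETENTION_YEARS` are illustrative placeholders rather than actual regulatory periods, and `Record`, `retention_expiry`, and `eligible_for_disposal` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical schedule: category -> retention in years (placeholder values).
RETENTION_YEARS = {"trade_communication": 6, "general_email": 3}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    on_legal_hold: bool = False


def retention_expiry(record):
    """Date on which the record's retention period ends."""
    years = RETENTION_YEARS[record.category]
    try:
        return record.created.replace(year=record.created.year + years)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return record.created.replace(year=record.created.year + years, day=28)


def eligible_for_disposal(record, today):
    if record.on_legal_hold:
        return False  # a legal hold overrides the retention schedule
    return today > retention_expiry(record)


records = [
    Record("r1", "general_email", date(2019, 5, 1)),
    Record("r2", "general_email", date(2019, 5, 1), on_legal_hold=True),
    Record("r3", "trade_communication", date(2019, 5, 1)),
]
today = date(2024, 1, 1)
print([r.record_id for r in records if eligible_for_disposal(r, today)])  # ['r1']
```

Only `r1` qualifies: `r2` has passed its retention period but is on hold, and `r3` is still within its six-year placeholder period.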