Grotabyte
Glossary

Archiving, eDiscovery & Compliance Glossary

Plain-language definitions of the terms that matter in enterprise information archiving, eDiscovery, and regulatory compliance — from WORM storage and legal holds to FOIA, SEC 17a-4, and EDRM.

Core concepts

Information Archiving
Information archiving is the practice of capturing, indexing, and preserving an organization's communications and electronic records in a secure, searchable, tamper-evident repository so they can be retained, retrieved, and produced for compliance, governance, and discovery.
Enterprise Information Archiving (EIA)
Enterprise Information Archiving (EIA) is the software category covering products that capture and retain email, files, and collaboration and messaging data at organization scale for regulatory compliance, eDiscovery, and records management. EIA platforms typically add supervision, retention management, and search on top of a central archive.
eDiscovery
Electronic discovery (eDiscovery) is the process of identifying, preserving, collecting, searching, reviewing, and producing electronically stored information (ESI) as evidence in litigation, investigations, or regulatory matters. Modern eDiscovery runs directly against an archive to cut the time and cost of responding to legal requests.
Email Archiving
Email archiving is the automated capture and long-term preservation of inbound, outbound, and internal email in a separate, immutable store — independent of the mail server — so messages remain complete, unaltered, and searchable for retention and discovery.
Data Archiving
Data archiving moves information that is no longer in active use into long-term, lower-cost, policy-governed storage where it stays retrievable. Unlike a backup (a short-term copy for disaster recovery), an archive is the system of record retained to satisfy compliance and discovery obligations.
EDRM (Electronic Discovery Reference Model)
The Electronic Discovery Reference Model (EDRM) is the widely used framework that describes the stages of eDiscovery: information governance, identification, preservation, collection, processing, review, analysis, production, and presentation. The EDRM project also publishes an XML interchange schema (EDRM XML) for moving data between discovery tools.
Early Case Assessment (ECA)
Early case assessment (ECA) is the practice of analyzing a potential matter's data early — its volume, key custodians, date ranges, and themes — to estimate risk, cost, and strategy before committing to full review. Running ECA against an archive helps cull irrelevant data and narrow scope.
Culling
Culling is the filtering of a collected data set to remove material that is irrelevant, duplicative, or out of scope — by date range, custodian, keyword, file type, or deduplication — so that only relevant documents proceed to costly review.
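
The culling filters described above can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: the Document fields, the cull function, and the sample data are all hypothetical, and duplicates are detected by exact body hash only (real tools usually also handle near-duplicates and email threading).

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    doc_id: str
    custodian: str
    sent: date
    body: str


def cull(docs, custodians, start, end, keywords):
    """Keep documents that pass custodian, date, keyword, and duplicate filters."""
    seen_hashes = set()
    kept = []
    for doc in docs:
        if doc.custodian not in custodians:
            continue  # custodian is out of scope
        if not (start <= doc.sent <= end):
            continue  # outside the date range
        text = doc.body.lower()
        if not any(k.lower() in text for k in keywords):
            continue  # no keyword hit
        digest = hashlib.sha256(doc.body.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate of a document already kept
        seen_hashes.add(digest)
        kept.append(doc)
    return kept


docs = [
    Document("1", "alice", date(2023, 3, 1), "Pricing proposal for Acme"),
    Document("2", "alice", date(2023, 3, 2), "Pricing proposal for Acme"),  # duplicate body
    Document("3", "bob", date(2021, 1, 5), "Pricing notes"),  # outside date range
    Document("4", "carol", date(2023, 4, 1), "Pricing update"),  # custodian not in scope
    Document("5", "bob", date(2023, 5, 9), "Lunch on Friday?"),  # no keyword hit
]

kept = cull(docs, {"alice", "bob"}, date(2023, 1, 1), date(2023, 12, 31), ["pricing"])
kept_ids = [d.doc_id for d in kept]
print(kept_ids)
```

Only the first document survives here; each of the other four is removed by a different filter, which is the point of culling before review.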

Capture

Journaling
Journaling is a capture method in which a copy of every message is automatically delivered to the archive at the moment it is sent or received, independent of the user's mailbox. Journaling ensures a complete, unaltered record even if a user later deletes the original.
PST / EML / MSG
PST, EML, and MSG are common email file formats. PST is a Microsoft Outlook container that stores many messages and folders; EML and MSG store individual messages. Archiving and eDiscovery platforms ingest and export these formats to interoperate with mail systems and review tools.
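
As a rough illustration of ingesting one of these formats, the sketch below parses a single EML message with Python's standard-library email package and pulls out fields an archive might index. The sample message, the index_fields helper, and the choice of fields are assumptions made for the example. PST and MSG are Outlook formats and need third-party libraries, which are not shown.

```python
from email import message_from_string, policy

# A tiny hand-written EML message used only for illustration.
RAW_EML = (
    "From: alice@example.com\r\n"
    "To: bob@example.com\r\n"
    "Subject: Q3 forecast\r\n"
    "Date: Mon, 06 Mar 2023 09:15:00 +0000\r\n"
    "Message-ID: <abc123@example.com>\r\n"
    "\r\n"
    "Numbers attached.\r\n"
)


def index_fields(raw):
    """Parse one EML message and return the fields to index."""
    msg = message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "message_id": str(msg["Message-ID"]),
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "sent": msg["Date"].datetime.isoformat(),
        "body": body.get_content().strip() if body is not None else "",
    }


fields = index_fields(RAW_EML)
print(fields["subject"], fields["sent"])
```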

Integrity & retention

WORM Storage
WORM (Write Once, Read Many) storage allows data to be written a single time and then read repeatedly but never altered or deleted before its retention period expires. WORM is a long-established way to meet the preservation requirements of regulations such as SEC Rule 17a-4 because it makes archived records immutable.
Immutability
Immutability is the property that a stored record cannot be modified or overwritten once committed. Immutable archives combine WORM storage and write-protection, which prevent changes, with cryptographic hashing, which makes any change detectable, so that preserved data can be shown to be identical to what was originally captured.
Retention Policy (Retention Schedule)
A retention policy is a set of rules defining how long each category of record must be kept and what happens when that period ends. Retention schedules are driven by regulation, legal exposure, and business need, and they underpin both compliance and defensible deletion.
Defensible Deletion
Defensible deletion is the documented, policy-driven disposal of data that has met its retention requirement and is not subject to any legal hold. Done correctly, with consistent policies and audit trails, it reduces storage cost, risk, and discovery scope while standing up to legal scrutiny.
Single-Instance Storage (Deduplication)
Single-instance storage, or deduplication, stores only one copy of identical content (such as a message sent to many recipients or a repeated attachment) while preserving each reference. It reduces archive size and cost without losing any record.
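
One common way to implement single-instance storage is content addressing: hash each item and store the bytes once per distinct hash, while keeping a separate reference for every record. The in-memory sketch below assumes that approach. The SingleInstanceStore class and its method names are hypothetical, and a real archive would persist both maps durably.

```python
import hashlib


class SingleInstanceStore:
    """Store identical content once while keeping a reference per record."""

    def __init__(self):
        self._blobs = {}       # content hash -> bytes, stored once
        self._references = {}  # record id -> content hash

    def put(self, record_id, content):
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)  # no-op if already stored
        self._references[record_id] = digest
        return digest

    def get(self, record_id):
        return self._blobs[self._references[record_id]]

    @property
    def blob_count(self):
        return len(self._blobs)

    @property
    def reference_count(self):
        return len(self._references)


store = SingleInstanceStore()
attachment = b"Quarterly report, final version"

# The same attachment delivered to three recipients.
for recipient in ("alice", "bob", "carol"):
    store.put(f"msg-42/{recipient}", attachment)

print(store.blob_count, store.reference_count)
```

Three references point at one stored copy, so every recipient's record remains retrievable even though the content is held only once.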

Regulations

FOIA (Freedom of Information Act)
The Freedom of Information Act (FOIA) and its state public-records equivalents give the public the right to request records held by government agencies. Agencies must search, review, redact, and produce responsive records within statutory deadlines (generally 20 working days for an initial response under the federal act), which makes fast, accurate archive search essential.
SEC Rule 17a-4
SEC Rule 17a-4 requires broker-dealers to preserve certain electronic records for specified periods, with indexing and prompt retrievability. The rule long required a non-rewriteable, non-erasable (WORM) format; amendments adopted in 2022 also permit an audit-trail alternative that can recreate an original record if it is modified or deleted. It is one of the most cited drivers of immutable email and communications archiving in financial services.
FINRA
The Financial Industry Regulatory Authority (FINRA) oversees U.S. broker-dealers and sets rules for retaining and supervising business communications, including electronic messaging and social media. FINRA expects firms to capture, retain, and review communications and to produce them on request.
MiFID II
The Markets in Financial Instruments Directive II (MiFID II) is an EU directive that, among other things, requires firms to record and retain communications related to transactions, including phone calls and electronic messages, for five years, or up to seven where the competent authority requests it.
HIPAA
The Health Insurance Portability and Accountability Act (HIPAA) sets U.S. standards for protecting health information. For archiving, HIPAA drives secure capture and retention of communications containing protected health information (PHI), with access controls, encryption, and audit logging.
CJIS
The Criminal Justice Information Services (CJIS) Security Policy governs how criminal justice information is accessed, stored, and protected by law enforcement and their vendors. CJIS mandates strong encryption, strict access control, and audit readiness for systems that hold this data.
GDPR
The General Data Protection Regulation (GDPR) is the EU privacy law governing personal data. It creates obligations such as data minimization and the right to erasure, which archives must reconcile with retention requirements through granular policy and defensible deletion.

Common questions

What is the difference between archiving and backup?

A backup is a short-term, recoverable copy of data used to restore systems after a failure. An archive is the long-term system of record — immutable, indexed, and policy-governed — kept to satisfy compliance and eDiscovery obligations. Backups answer 'can we recover?'; archives answer 'can we prove and produce?'

What is the difference between eDiscovery and archiving?

Archiving is the ongoing capture and preservation of records in a searchable, tamper-evident repository. eDiscovery is the process of finding, reviewing, and producing specific records from that repository (or other sources) as evidence. A strong archive makes eDiscovery faster and cheaper.

Why is WORM storage required for compliance?

Regulations such as SEC Rule 17a-4 require that records stay complete and unaltered until retention expires, and preserving them in a non-rewriteable, non-erasable format is the long-established way to meet that requirement. WORM storage enforces this immutability, making the archive trustworthy as evidence.
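
The write-once behavior can be illustrated with a small in-memory sketch. It is only a model of the semantics: the WormStore class and WormViolation exception are hypothetical names, and in a real deployment the guarantee comes from the storage layer (hardware or object-lock features), not from application code like this. The stored SHA-256 digest shows how a read can also verify that content is unchanged.

```python
import hashlib
from datetime import date


class WormViolation(Exception):
    """Raised when a caller tries to alter or prematurely delete a record."""


class WormStore:
    def __init__(self):
        # record id -> (content, sha256 digest, last day of retention)
        self._records = {}

    def write(self, record_id, content, retain_until):
        if record_id in self._records:
            raise WormViolation(f"{record_id}: already written, overwrite refused")
        digest = hashlib.sha256(content).hexdigest()
        self._records[record_id] = (content, digest, retain_until)
        return digest

    def read(self, record_id):
        content, digest, _ = self._records[record_id]
        if hashlib.sha256(content).hexdigest() != digest:
            raise WormViolation(f"{record_id}: integrity check failed")
        return content

    def delete(self, record_id, today):
        _, _, retain_until = self._records[record_id]
        if today <= retain_until:
            raise WormViolation(f"{record_id}: retained until {retain_until}")
        del self._records[record_id]


store = WormStore()
store.write("msg-001", b"Trade confirmation", retain_until=date(2030, 12, 31))

try:
    store.write("msg-001", b"Edited confirmation", retain_until=date(2030, 12, 31))
except WormViolation as err:
    print(err)

try:
    store.delete("msg-001", today=date(2025, 1, 1))
except WormViolation as err:
    print(err)
```

Both the overwrite and the early delete are refused, and the original bytes remain readable.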

How long do I need to retain records?

Retention periods depend on the record type and the regulations that apply. For example, SEC Rule 17a-4 sets three- or six-year periods depending on the record, and MiFID II generally requires five years, while other records follow internal or sector-specific schedules. A retention policy maps each category to its required period and disposal rule.
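
A retention schedule of this kind can be modeled as a simple mapping from record category to retention period, with a check that blocks disposal while a legal hold applies. The sketch below is illustrative only: the category names, the periods in RETENTION_YEARS, and the Record fields are hypothetical placeholders, not legal guidance for any specific regulation.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical schedule: category -> retention period in years.
RETENTION_YEARS = {
    "trade_communications": 5,
    "general_correspondence": 3,
}


@dataclass(frozen=True)
class Record:
    record_id: str
    category: str
    created: date
    on_legal_hold: bool = False


def retention_end(record):
    """Return the date on which the record's retention period ends."""
    target_year = record.created.year + RETENTION_YEARS[record.category]
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return record.created.replace(year=target_year, day=28)


def disposition(record, today):
    """Decide whether a record must be kept or may be disposed of."""
    if record.on_legal_hold:
        return "retain: legal hold"
    if today < retention_end(record):
        return "retain: within retention period"
    return "eligible for disposal"


today = date(2024, 6, 1)
records = [
    Record("r1", "general_correspondence", date(2019, 1, 15)),
    Record("r2", "trade_communications", date(2021, 5, 1)),
    Record("r3", "general_correspondence", date(2019, 1, 15), on_legal_hold=True),
]
decisions = {r.record_id: disposition(r, today) for r in records}
for record_id, decision in decisions.items():
    print(record_id, decision)
```

Checking the legal hold before the retention date means a hold always wins, which is what makes the eventual deletion defensible.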
