Sharding Data Layer
The Sharding Data Layer is the engine at the core of Untrace's architecture. It takes any data blob, splits it into cryptographically independent fragments, and distributes those fragments across the decentralized node network — ensuring that no single node, operator, or jurisdiction ever holds enough to reconstruct the original.
Why Sharding?
Traditional storage — cloud, on-premise, or even most blockchains — keeps data whole in one location. That is the root of every data breach: a single target worth attacking.
Untrace's sharding layer eliminates the target. When data is sharded:
- No single node can be breached for the data — each fragment is individually meaningless
- No subpoena is sufficient — no single jurisdiction controls enough shards
- No single point of failure exists — the network tolerates node failures without data loss
- No administrator can be compelled — there is no administrator with access to the whole
The Sharding Pipeline
[ Raw Data Blob ]
↓
[ AES-256-GCM Encryption ]
→ Outputs: (encrypted_blob, symmetric_key K)
↓
[ Shamir's Secret Sharing applied to K ]
→ Outputs: N key shares (k₁, k₂, ..., kₙ)
↓
[ Encrypted blob split into N data segments ]
→ Outputs: N segments (s₁, s₂, ..., sₙ)
↓
[ Each node receives: (kᵢ, sᵢ) ]
↓
[ On-chain: Pedersen commitment to all (kᵢ, sᵢ) pairs ]
↓
[ Reconstruction: Collect K-of-N pairs → Lagrange interpolation → decrypt ]
The encryption and sharding happen entirely client-side. The Untrace network sees only encrypted, individually useless fragments.
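The key-splitting and reconstruction steps of the pipeline can be sketched in pure Python. This is an illustrative toy of Shamir's Secret Sharing over a prime field (no AES step, no network), not the production implementation:

```python
import secrets

# Toy Shamir's Secret Sharing over GF(p). A real deployment would use a
# vetted library and pair each key share k_i with its data segment s_i,
# as shown in the pipeline above.
P = 2**127 - 1  # a Mersenne prime; the field must be larger than the secret

def split(secret: int, k: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares; any k of them reconstruct it."""
    # Random polynomial of degree k-1 whose constant term is the secret
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(k - 1)]
    return [
        (x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
        for x in range(1, n + 1)
    ]

def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = secrets.randbelow(P)      # stands in for the symmetric key K
shares = split(key, k=3, n=5)   # Standard tier: K=3, N=5
assert reconstruct(shares[:3]) == key  # any 3 of the 5 shares suffice
assert reconstruct(shares[2:]) == key
```

With fewer than K shares the interpolation yields a value that is statistically independent of the key, which is what makes individual fragments meaningless.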
Shard Distribution
Node Selection
When a vault is created, the protocol selects N nodes for shard assignment using a deterministic but unpredictable selection algorithm seeded by the vault ID and current epoch:
node_set = select_nodes(
vault_id,
epoch,
n = threshold_config.total_shards,
constraints = {
max_per_operator: 1, // No operator gets two shards
geo_diversity: true, // Different countries required
as_diversity: true, // Different autonomous systems required
jurisdiction_diversity: true
}
)
This selection is verifiable on-chain — anyone can confirm that distribution followed the protocol rules.
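A minimal sketch of such a selection algorithm is shown below. The node-record fields (`operator`, `country`, `asn`) and the hash-based ranking are assumptions for illustration; the protocol's actual node records and ordering rules may differ:

```python
import hashlib

# Illustrative deterministic node selection seeded by (vault_id, epoch).
# Hypothetical node record: {"id", "operator", "country", "asn"}.
def select_nodes(vault_id: str, epoch: int, nodes: list[dict], n: int) -> list[dict]:
    def rank(node: dict) -> bytes:
        # Deterministic but unpredictable ordering: hash of seed + node id
        return hashlib.sha256(f"{vault_id}:{epoch}:{node['id']}".encode()).digest()

    selected, operators, countries, asns = [], set(), set(), set()
    for node in sorted(nodes, key=rank):
        # Diversity constraints: at most one shard per operator,
        # distinct countries and distinct autonomous systems
        if node["operator"] in operators or node["country"] in countries \
                or node["asn"] in asns:
            continue
        selected.append(node)
        operators.add(node["operator"])
        countries.add(node["country"])
        asns.add(node["asn"])
        if len(selected) == n:
            return selected
    raise ValueError("not enough diverse nodes to satisfy constraints")

nodes = [{"id": f"n{i}", "operator": f"op{i}", "country": f"c{i}", "asn": i}
         for i in range(10)]
picked = select_nodes("vault-1", epoch=7, nodes=nodes, n=5)
assert len(picked) == 5
# Same seed, same result: anyone can re-run the selection to verify it
assert picked == select_nodes("vault-1", epoch=7, nodes=nodes, n=5)
```

Because the ordering is a pure function of the vault ID, epoch, and node set, any observer can recompute it and confirm the distribution followed the rules.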
Threshold Configuration
| Sensitivity Level | K (threshold) | N (total shards) | Tolerates       |
| ----------------- | ------------- | ---------------- | --------------- |
| Standard          | 3             | 5                | 2 node failures |
| High              | 5             | 9                | 4 node failures |
| Maximum           | 7             | 13               | 6 node failures |
Enterprises with specific compliance requirements can configure custom (K, N) parameters.
Shard Integrity
Each shard pair (kᵢ, sᵢ) is protected by two integrity layers:
1. MAC at the shard level: Each shard is authenticated with a Message Authentication Code (MAC). Any tampering with a shard is detected during reconstruction — the corrupted shard is rejected and a replacement is requested from backup nodes.
2. On-chain commitment: A Pedersen commitment to all shards is anchored at vault creation time. At reconstruction, the client verifies that retrieved shards match the on-chain commitment before decrypting, preventing substitution attacks.
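The shard-level MAC check can be sketched with HMAC-SHA-256 from the Python standard library. The key derivation and shard layout here are assumptions for the sketch, not the protocol's actual construction:

```python
import hashlib
import hmac
import secrets

# Illustrative shard authentication. Binding the shard index into the
# MAC input prevents a malicious node from swapping shard positions.
def mac_shard(mac_key: bytes, index: int, shard: bytes) -> bytes:
    return hmac.new(mac_key, index.to_bytes(4, "big") + shard,
                    hashlib.sha256).digest()

def verify_shards(mac_key: bytes, shards: dict[int, tuple[bytes, bytes]]) -> list[int]:
    """Return indices of shards that fail authentication and must be
    re-fetched from backup nodes."""
    return [
        index
        for index, (shard, tag) in shards.items()
        if not hmac.compare_digest(mac_shard(mac_key, index, shard), tag)
    ]

mac_key = secrets.token_bytes(32)
data = {i: secrets.token_bytes(64) for i in range(5)}
shards = {i: (s, mac_shard(mac_key, i, s)) for i, s in data.items()}
shards[2] = (b"tampered" + shards[2][0][8:], shards[2][1])  # corrupt one shard
assert verify_shards(mac_key, shards) == [2]  # only shard 2 is rejected
```

Note the constant-time comparison (`compare_digest`), which avoids leaking tag bytes through timing.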
Shard Replication and Availability
Beyond the primary N nodes, the protocol maintains backup replicas to guarantee availability:
- Each shard has M backup nodes (default M = 2) that hold encrypted copies
- Backup nodes activate automatically if a primary node fails storage proofs
- The vault owner is never required to manage replication manually
- Replication targets maintain the same geographic and jurisdictional diversity constraints
The network is designed to survive the simultaneous failure of any N−K primary nodes, plus all of their backup replicas, without data loss.
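The survivability claim can be checked with a small counting model. It assumes each shard lives on one primary node plus M backups, and that a shard is lost only when its primary and all of its backups fail; the real network's failure modes are richer:

```python
# Small counting model for the availability claim above (an
# illustrative simplification, not the protocol's actual analysis).
def vault_survives(k: int, n: int, failed_shard_groups: int) -> bool:
    """The vault survives while at least K of the N shards remain."""
    return n - failed_shard_groups >= k

def max_tolerated_node_failures(k: int, n: int, m: int) -> int:
    """Worst case the design statement covers: the N - K expendable
    shards each lose their primary node and all M backup replicas."""
    return (n - k) * (1 + m)

assert vault_survives(3, 5, failed_shard_groups=2)      # Standard tier holds
assert not vault_survives(3, 5, failed_shard_groups=3)  # one group too many
assert max_tolerated_node_failures(3, 5, 2) == 6        # 6 concentrated failures
assert max_tolerated_node_failures(7, 13, 2) == 18      # Maximum tier
```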
Shard Lifecycle
Vault Write
→ Shards created, distributed, committed on-chain
→ Nodes begin submitting PoSt (Proof of Spacetime) every epoch
Vault Update
→ New shard generation created for the same vault ID
→ Previous generation retained (versioning) or pruned per policy
→ New on-chain commitment anchored
Vault Delete
→ Deletion instruction signed by owner DID
→ Broadcast to all shard nodes
→ Nodes destroy shards and submit destruction proof
→ On-chain commitment marked as deleted
Deletion is cryptographically enforced — nodes that fail to destroy shards after a deletion order have their stake slashed.
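The delete handshake can be sketched as follows. This is a toy: the HMAC stands in for the owner DID's asymmetric signature, and the destruction-proof format (a digest of the destroyed shard) is an assumption for illustration, not the protocol's actual construction:

```python
import hashlib
import hmac
import secrets

# Illustrative shard-node deletion handling. Real nodes would verify an
# asymmetric signature against the owner's DID document.
class ShardNode:
    def __init__(self, owner_key: bytes, shard: bytes):
        self.owner_key = owner_key
        self.shard = shard

    def handle_delete(self, vault_id: str, signature: bytes) -> bytes:
        # Reject deletion orders not signed by the vault owner
        expected = hmac.new(self.owner_key, f"delete:{vault_id}".encode(),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, signature):
            raise PermissionError("deletion order not signed by owner")
        proof = hashlib.sha256(self.shard).digest()
        self.shard = None  # destroy the local shard
        # Hypothetical destruction proof, checked against the on-chain
        # commitment before the vault is marked deleted
        return proof

owner_key = secrets.token_bytes(32)
node = ShardNode(owner_key, shard=secrets.token_bytes(64))
sig = hmac.new(owner_key, b"delete:vault-1", hashlib.sha256).digest()
proof = node.handle_delete("vault-1", sig)
assert node.shard is None and len(proof) == 32
```

A node that returns no valid proof after a deletion order is the case the slashing rule targets.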
Comparison With Alternative Approaches
| Approach               | Breach Possible                 | Single Point of Failure | Privacy by Default |
| ---------------------- | ------------------------------- | ----------------------- | ------------------ |
| Centralized cloud      | Yes                             | Yes                     | No                 |
| Encrypted cloud        | Yes (key theft)                 | Yes                     | No                 |
| IPFS / Filecoin        | Yes (content addressed, public) | Partial                 | No                 |
| Untrace Sharding Layer | No                              | None                    | Yes                |
IPFS and Filecoin store whole data objects — sharding in those systems is about redundancy, not privacy. Untrace sharding is different: each fragment is individually encrypted and cryptographically meaningless without the others, and access to the reconstruction key is ZK-gated.
Further Reading
- Shamir Secret Sharing (SSS) — The cryptographic algorithm powering the key sharding
- ZK Data Vaults — How the sharding layer integrates with vault identity and access control
- Web3 Access Control Layer — How ZK proofs gate the shard reconstruction process
- Whitepaper — Full protocol specification including storage proofs