Digital Workflow & Provenance Scenario
This guide shows how to build a digital workflow provenance system with TrustWeave and PROV-O (the W3C PROV Ontology). The system tracks and verifies digital information as it moves through processing workflows and transformations, producing a verifiable data lineage.
What You’ll Build
By the end of this tutorial, you’ll have:
- ✅ Created DIDs for workflow participants (agents, activities, entities)
- ✅ Tracked digital information through processing workflows
- ✅ Built provenance chains using PROV-O concepts
- ✅ Issued provenance credentials for workflow steps
- ✅ Verified data lineage and transformation history
- ✅ Anchored provenance records to blockchain
- ✅ Built a complete workflow provenance system
Big Picture & Significance
The Digital Provenance Challenge
In digital workflows, information undergoes multiple transformations, processing steps, and handoffs between different agents. Understanding where data came from, how it was processed, and who handled it is critical for trust, compliance, and debugging.
Industry Context:
- Market Size: Data lineage and provenance market projected to reach $2.1 billion by 2027
- Regulatory Pressure: Increasing requirements for data provenance (GDPR, data governance)
- Trust Issues: Need to verify data hasn’t been tampered with
- Compliance: Audit trails required for regulatory compliance
- Debugging: Provenance helps debug data quality issues
Why This Matters:
- Data Trust: Verify data hasn’t been tampered with
- Compliance: Meet regulatory requirements for data lineage
- Debugging: Trace data issues to their source
- Accountability: Know who processed data and when
- Reproducibility: Reproduce data processing workflows
- Transparency: Understand data transformations
The Provenance Problem
Traditional systems struggle with provenance because:
- No Standard Format: Each system tracks provenance differently
- Data Silos: Provenance information is scattered
- No Verification: Can’t verify provenance claims
- Incomplete Records: Missing information about transformations
- No Interoperability: Can’t share provenance across systems
Value Proposition
Problems Solved
- Standard Provenance: PROV-O standard format for interoperability
- Verifiable Provenance: Cryptographic proof of provenance claims
- Complete Lineage: Track data through all transformations
- Interoperability: Standard format works across systems
- Compliance: Automated audit trails for regulatory requirements
- Trust: Verify data hasn’t been tampered with
- Reproducibility: Reproduce data processing workflows
Business Benefits
For Organizations:
- Compliance: Meet regulatory requirements
- Trust: Build trust in data quality
- Debugging: Faster issue resolution
- Accountability: Clear responsibility tracking
For Data Scientists:
- Reproducibility: Reproduce workflows
- Transparency: Understand data transformations
- Collaboration: Share workflows easily
For Regulators:
- Audit Trails: Complete data lineage
- Verification: Verify provenance claims
- Transparency: Understand data processing
ROI Considerations
- Compliance: Automated audit trails can reduce compliance costs (by up to 50%)
- Debugging: Faster issue resolution saves time
- Trust: Increased data trust enables new use cases
- Reproducibility: Enables collaboration and knowledge sharing
Understanding the Problem
Digital workflow provenance faces several critical challenges:
- Data Lineage: Track data through multiple transformations
- Agent Tracking: Know which agents processed data
- Transformation History: Understand what transformations were applied
- Verification: Verify provenance claims
- Interoperability: Share provenance across systems
- Completeness: Ensure complete provenance records
Real-World Pain Points
Example 1: Image Processing Pipeline
- Current: No way to track image through processing steps
- Problem: Can’t verify image authenticity or processing history
- Solution: Verifiable provenance credentials for each processing step
Example 2: Data Science Workflow
- Current: No provenance tracking for data transformations
- Problem: Can’t reproduce results or debug issues
- Solution: Complete provenance chain with verifiable credentials
Example 3: Content Creation
- Current: No way to prove content creation process
- Problem: Can’t verify content authenticity or authorship
- Solution: Provenance credentials tracking creation workflow
How It Works: Provenance Flow
flowchart TD
A["Source Entity<br/>Original Data<br/>Entity DID"] -->|used by| B["Activity Processing Step<br/>Activity DID<br/>Transformation Applied<br/>Agent DID who performed"]
B -->|generated| C["Derived Entity<br/>Processed Data<br/>Entity DID<br/>Provenance Credential"]
C -->|anchors to blockchain| D["Blockchain Anchor<br/>Immutable Provenance Record<br/>Complete Lineage"]
style A fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
style B fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style C fill:#388e3c,stroke:#1b5e20,stroke-width:2px,color:#fff
style D fill:#c2185b,stroke:#880e4f,stroke-width:2px,color:#fff
Key Concepts
PROV-O Concepts
- Entity: A digital object (e.g., image, dataset, document)
- Activity: A processing step or transformation
- Agent: Who or what performed the activity
- Used (prov:used): An activity used an entity as input
- Generated (prov:generated, the inverse of prov:wasGeneratedBy): An activity generated a new entity
- WasAttributedTo (prov:wasAttributedTo): An entity is attributed to the agent responsible for it
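The relationships above can be sketched as plain Kotlin types. This is a minimal illustration and not TrustWeave's API: the type names `ProvEntity`, `ProvActivity`, `ProvAgent`, and `ProvRecord`, the `describe` helper, and the `did:example:` identifiers are all assumptions introduced here. Attributing the generated entity to the agent that ran the activity is a modeling choice of this sketch.

```kotlin
// Hypothetical, simplified model of the three PROV-O core classes.
data class ProvEntity(val did: String, val contentHash: String)
data class ProvActivity(val did: String, val type: String)
data class ProvAgent(val did: String, val kind: String)

// One provenance statement: an activity used one entity, generated another,
// and was carried out by an agent.
data class ProvRecord(
    val used: ProvEntity,
    val activity: ProvActivity,
    val generated: ProvEntity,
    val agent: ProvAgent
)

// Renders the record as human-readable PROV-style statements.
fun describe(record: ProvRecord): List<String> = listOf(
    "${record.activity.did} used ${record.used.did}",
    "${record.generated.did} wasGeneratedBy ${record.activity.did}",
    "${record.activity.did} wasAssociatedWith ${record.agent.did}",
    // Modeling choice: the output is attributed to the agent that ran the activity
    "${record.generated.did} wasAttributedTo ${record.agent.did}"
)

fun main() {
    val record = ProvRecord(
        used = ProvEntity("did:example:original", "sha256:aaa"),
        activity = ProvActivity("did:example:resize", "resize-image"),
        generated = ProvEntity("did:example:resized", "sha256:bbb"),
        agent = ProvAgent("did:example:processor", "image-processing-service")
    )
    describe(record).forEach(::println)
}
```

Each field of `ProvRecord` corresponds to one block of the provenance chain credential built later in the walkthrough.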
Provenance Credential Types
- Entity Credential: Describes a digital entity
- Activity Credential: Describes a processing activity
- Provenance Chain Credential: Links entities through activities
- Transformation Credential: Describes data transformation
Prerequisites
- Java 21+
- Kotlin 2.2.0+
- Gradle 8.5+
- Basic understanding of Kotlin and coroutines
- Familiarity with PROV-O concepts (helpful but not required)
Step 1: Add Dependencies
Add TrustWeave dependencies to your build.gradle.kts. These modules provide DID support, credential issuance, wallet storage, and the in-memory services used to model provenance workflows.
dependencies {
// Core TrustWeave modules
implementation("com.trustweave:trustweave-core:1.0.0-SNAPSHOT")
implementation("com.trustweave:trustweave-json:1.0.0-SNAPSHOT")
implementation("com.trustweave:trustweave-kms:1.0.0-SNAPSHOT")
implementation("com.trustweave:trustweave-did:1.0.0-SNAPSHOT")
implementation("com.trustweave:trustweave-anchor:1.0.0-SNAPSHOT")
// Test kit for in-memory implementations
implementation("com.trustweave:trustweave-testkit:1.0.0-SNAPSHOT")
// Kotlinx Serialization
implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.6.0")
// Coroutines
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.7.3")
}
Result: With the dependencies synced, you can run the provenance walkthrough without wiring additional adapters.
Step 2: Setup and Create Entity DIDs
Purpose: Initialize the provenance system and create DIDs for workflow entities.
Why This Matters: In PROV-O, everything is an entity, activity, or agent. Each needs a unique DID for verifiable identity. This enables tracking relationships between entities through activities.
Rationale:
- Entity DIDs: Represent digital objects (images, datasets, documents)
- Activity DIDs: Represent processing steps (transformations, analyses)
- Agent DIDs: Represent who/what performed activities
- Separation: Clear separation enables precise provenance tracking
import com.trustweave.testkit.did.DidKeyMockMethod
import com.trustweave.testkit.kms.InMemoryKeyManagementService
import com.trustweave.did.DidMethodRegistry
import kotlinx.coroutines.runBlocking
fun main() = runBlocking {
println("=== Digital Workflow & Provenance Scenario ===\n")
// Step 1: Setup services
println("Step 1: Setting up services...")
// Separate KMS for different workflow participants
// This ensures proper key isolation and security
val processorKms = InMemoryKeyManagementService() // For processing agents
val sourceKms = InMemoryKeyManagementService() // For source entities (shown for isolation; not used further in this walkthrough)
val didMethod = DidKeyMockMethod(processorKms)
val didRegistry = DidMethodRegistry().apply { register(didMethod) }
println("Services initialized")
}
Step 3: Create Source Entity DID
Purpose: Create a DID for the original source entity (e.g., the original image).
Why This Matters: The source entity is the starting point of the provenance chain. Its DID provides a persistent identifier that can be referenced throughout the workflow.
Rationale:
- Source Entity: Represents original, unprocessed data
- Entity DID: Provides persistent identifier
- Provenance Start: Beginning of provenance chain
- Verification: Can verify entity hasn’t been tampered with
import java.time.Instant
// Step 2: Create source entity DID
println("\nStep 2: Creating source entity DID...")
// Source entity represents original data before any processing
// In image processing example, this would be the original photograph
// The DID provides persistent identity that survives transformations
val sourceEntityDid = didMethod.createDid()
println("Source Entity DID: ${sourceEntityDid.id}")
// Create entity credential for source
// This credential describes the original entity
val sourceEntityCredential = createEntityCredential(
entityDid = sourceEntityDid.id,
entityType = "Image",
entityHash = "sha256:original-image-hash",
metadata = mapOf(
"format" to "JPEG",
"resolution" to "1920x1080",
"created" to Instant.now().toString()
)
)
println("Source entity credential created:")
println(" - Type: Image")
println(" - Format: JPEG")
println(" - Resolution: 1920x1080")
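The `entityHash` value in the snippet above is a placeholder string. One way to derive a real value is to hash the entity's bytes with SHA-256 using the JDK's `MessageDigest`. In this sketch the helper name `entityHash` and the `sha256:` plus lowercase hex format are assumptions, since the tutorial does not specify an encoding.

```kotlin
import java.security.MessageDigest

// Hypothetical helper: hashes raw entity bytes and formats the digest as
// "sha256:<lowercase hex>". The prefix format is an assumption of this sketch.
fun entityHash(content: ByteArray): String {
    val digest = MessageDigest.getInstance("SHA-256").digest(content)
    return "sha256:" + digest.joinToString("") { "%02x".format(it) }
}

fun main() {
    // Stand-ins for image bytes; in practice, read the file contents instead
    val original = "abc".toByteArray(Charsets.UTF_8)
    val modified = "abd".toByteArray(Charsets.UTF_8)
    println(entityHash(original))
    // Any change to the bytes yields a different hash
    println(entityHash(original) == entityHash(modified)) // prints "false"
}
```

Recording a content hash next to the entity DID lets a verifier recompute the hash later and detect whether the underlying bytes changed.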
Step 4: Create Activity and Agent DIDs
Purpose: Create DIDs for processing activity and the agent performing it.
Why This Matters: PROV-O requires tracking who (agent) did what (activity) to which entities. These DIDs enable verifiable provenance relationships.
Rationale:
- Activity DID: Represents the processing step (e.g., “image-resize”)
- Agent DID: Represents who/what performed it (e.g., processing service)
- Relationship Tracking: DIDs enable tracking relationships
- Verification: Can verify who did what and when
// Step 3: Create activity and agent DIDs
println("\nStep 3: Creating activity and agent DIDs...")
// Activity DID represents a processing step
// In PROV-O, activities are things that happen and transform entities
// Example: "resize-image", "apply-filter", "crop-image"
val resizeActivityDid = didMethod.createDid()
println("Activity (Resize) DID: ${resizeActivityDid.id}")
// Agent DID represents who or what performed the activity
// This could be a person, software service, or automated system
// Example: "image-processing-service", "user-alice"
val processingAgentDid = didMethod.createDid()
println("Agent (Image Processor) DID: ${processingAgentDid.id}")
Step 5: Create Activity Credential
Purpose: Create credential describing the processing activity.
Why This Matters: The activity credential records what transformation was applied. This is critical for reproducibility: you need to know exactly what was done to the data.
Rationale:
- Activity Description: What processing was performed
- Parameters: How the processing was configured
- Timing: When the activity occurred
- Agent Reference: Who/what performed it
import com.trustweave.credential.models.VerifiableCredential
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.put
import java.time.Instant
// Step 4: Create activity credential
println("\nStep 4: Creating activity credential...")
// Activity credential describes the processing step
// This follows PROV-O "Activity" concept
// Records what transformation was applied and how
val resizeActivityCredential = VerifiableCredential(
id = "https://processor.example.com/activities/${resizeActivityDid.id.substringAfterLast(":")}",
type = listOf("VerifiableCredential", "ActivityCredential", "ProvenanceCredential"),
issuer = processingAgentDid.id, // Agent issues credential about activity they performed
credentialSubject = buildJsonObject {
put("id", resizeActivityDid.id)
put("activity", buildJsonObject {
put("activityType", "resize-image")
put("description", "Resize image to target dimensions")
put("parameters", buildJsonObject {
put("targetWidth", "800")
put("targetHeight", "600")
put("algorithm", "lanczos")
put("maintainAspectRatio", "true")
})
put("startTime", Instant.now().toString())
put("endTime", Instant.now().plusSeconds(5).toString())
put("agentDid", processingAgentDid.id)
})
},
issuanceDate = Instant.now().toString(),
expirationDate = null
)
println("Activity credential created:")
println(" - Type: resize-image")
println(" - Parameters: 800x600, lanczos algorithm")
println(" - Agent: ${processingAgentDid.id}")
Step 6: Create Provenance Chain Credential
Purpose: Create credential linking source entity to derived entity through activity.
Why This Matters: This credential captures the PROV-O relationships: entity was “used” by activity, activity “generated” new entity. This creates the provenance chain.
Rationale:
- Used Relationship: Source entity was used by activity
- Generated Relationship: Activity generated derived entity
- Chain Continuity: Links entities through activities
- Complete Lineage: Enables full provenance tracking
// Step 5: Create derived entity DID
println("\nStep 5: Creating derived entity DID...")
// Derived entity is the result of processing
// In PROV-O, activities generate new entities from used entities
// This entity is the resized image
val derivedEntityDid = didMethod.createDid()
println("Derived Entity DID: ${derivedEntityDid.id}")
// Step 6: Create provenance chain credential
println("\nStep 6: Creating provenance chain credential...")
// Provenance chain credential links entities through activities
// This follows PROV-O relationships:
// - sourceEntity was "used" by activity
// - activity "generated" derivedEntity
// This creates verifiable provenance chain
val provenanceChainCredential = VerifiableCredential(
id = "https://processor.example.com/provenance/${sourceEntityDid.id.substringAfterLast(":")}-to-${derivedEntityDid.id.substringAfterLast(":")}",
type = listOf("VerifiableCredential", "ProvenanceChainCredential", "ProvenanceCredential"),
issuer = processingAgentDid.id,
credentialSubject = buildJsonObject {
put("provenance", buildJsonObject {
// PROV-O: Entity that was used
put("usedEntity", buildJsonObject {
put("entityDid", sourceEntityDid.id)
put("entityHash", "sha256:original-image-hash")
put("role", "input")
})
// PROV-O: Activity that used the entity
put("activity", buildJsonObject {
put("activityDid", resizeActivityDid.id)
put("activityType", "resize-image")
})
// PROV-O: Entity that was generated
put("generatedEntity", buildJsonObject {
put("entityDid", derivedEntityDid.id)
put("entityHash", "sha256:resized-image-hash")
put("role", "output")
})
// PROV-O: Agent that performed activity
put("agent", buildJsonObject {
put("agentDid", processingAgentDid.id)
put("agentType", "image-processing-service")
})
// Timestamp of transformation
put("timestamp", Instant.now().toString())
})
},
issuanceDate = Instant.now().toString(),
expirationDate = null
)
println("Provenance chain credential created:")
println(" - Used: ${sourceEntityDid.id}")
println(" - Activity: resize-image")
println(" - Generated: ${derivedEntityDid.id}")
Step 7: Issue Credentials with Proof
Purpose: Cryptographically sign provenance credentials to make them verifiable.
Why This Matters: A cryptographic proof makes tampering detectable: any change to a signed provenance claim invalidates its signature. This is critical for trust, because verifiers need to confirm that provenance records are authentic.
Rationale:
- Key Generation: Generate signing key for processing agent
- Proof Generation: Create cryptographic proof
- Credential Issuance: Sign credentials with agent’s key
- Verification: Anyone can verify credentials came from agent
import com.trustweave.credential.issuer.CredentialIssuer
import com.trustweave.credential.proof.Ed25519ProofGenerator
import com.trustweave.credential.proof.ProofGeneratorRegistry
import com.trustweave.credential.CredentialIssuanceOptions
// Step 7: Issue credentials with proof
println("\nStep 7: Issuing provenance credentials...")
// Generate agent's signing key
// This key will be used to sign all provenance credentials
// In production, use hardware security module (HSM)
val agentKey = processorKms.generateKey("Ed25519")
// Create proof generator for agent
// Ed25519 provides strong security with good performance
val agentProofGenerator = Ed25519ProofGenerator(
signer = { data, keyId -> processorKms.sign(keyId, data) },
getPublicKeyId = { keyId -> agentKey.id }
)
val agentProofRegistry = ProofGeneratorRegistry().apply {
register(agentProofGenerator)
}
// Create credential issuer
val agentIssuer = CredentialIssuer(
proofGenerator = agentProofGenerator,
resolveDid = { did -> didRegistry.resolve(did) != null },
proofRegistry = agentProofRegistry
)
// Issue activity credential
// This proves the activity was performed by the agent
val issuedActivityCredential = agentIssuer.issue(
credential = resizeActivityCredential,
issuerDid = processingAgentDid.id,
keyId = agentKey.id,
options = CredentialIssuanceOptions(proofType = "Ed25519Signature2020")
)
// Issue provenance chain credential
// This proves the provenance relationships are authentic
val issuedProvenanceChain = agentIssuer.issue(
credential = provenanceChainCredential,
issuerDid = processingAgentDid.id,
keyId = agentKey.id,
options = CredentialIssuanceOptions(proofType = "Ed25519Signature2020")
)
println("Provenance credentials issued:")
println(" - Activity credential: ${issuedActivityCredential.proof != null}")
println(" - Provenance chain: ${issuedProvenanceChain.proof != null}")
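To see what signing and verification do underneath the issuer, here is a minimal sketch using the JDK's built-in Ed25519 support (available since Java 15). It is not TrustWeave's proof generator: the helper names `signPayload` and `verifyPayload` and the JSON payload are assumptions for illustration, and a real proof suite would canonicalize the credential before signing.

```kotlin
import java.security.KeyPairGenerator
import java.security.PrivateKey
import java.security.PublicKey
import java.security.Signature

// Signs the payload bytes with an Ed25519 private key.
fun signPayload(payload: ByteArray, key: PrivateKey): ByteArray =
    Signature.getInstance("Ed25519").run {
        initSign(key)
        update(payload)
        sign()
    }

// Returns true only if the signature matches both the payload and the public key.
fun verifyPayload(payload: ByteArray, signature: ByteArray, key: PublicKey): Boolean =
    Signature.getInstance("Ed25519").run {
        initVerify(key)
        update(payload)
        verify(signature)
    }

fun main() {
    val keyPair = KeyPairGenerator.getInstance("Ed25519").generateKeyPair()
    val payload = """{"activityType":"resize-image"}""".toByteArray()
    val signature = signPayload(payload, keyPair.private)

    // The untouched payload verifies
    println(verifyPayload(payload, signature, keyPair.public))

    // A payload that was changed after signing does not
    val tampered = """{"activityType":"crop-image"}""".toByteArray()
    println(verifyPayload(tampered, signature, keyPair.public))
}
```

The same principle applies to the issued credentials: the `proof` binds the credential content to the agent's key, so any later edit to the content fails verification.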
Step 8: Build Multi-Step Provenance Chain
Purpose: Extend provenance chain through multiple processing steps.
Why This Matters: Real workflows have multiple steps. Each step generates a new entity and extends the provenance chain. This enables complete lineage tracking.
Rationale:
- Chain Extension: Each step adds to the chain
- Entity References: Each step references previous entity
- Complete Lineage: Enables full workflow tracking
- Verification: Can verify entire chain
// Step 8: Build multi-step provenance chain
println("\nStep 8: Building multi-step provenance chain...")
// Real workflows have multiple processing steps
// Each step creates a new entity and extends the provenance chain
// Example: Original Image → Resize → Apply Filter → Crop → Final Image
// Second processing step: apply a filter to the resized image
val filterActivityDid = didMethod.createDid()
val filteredEntityDid = didMethod.createDid()
// Create filter activity credential
val filterActivityCredential = VerifiableCredential(
type = listOf("VerifiableCredential", "ActivityCredential", "ProvenanceCredential"),
issuer = processingAgentDid.id,
credentialSubject = buildJsonObject {
put("id", filterActivityDid.id)
put("activity", buildJsonObject {
put("activityType", "apply-filter")
put("parameters", buildJsonObject {
put("filterType", "sharpen")
put("intensity", "0.5")
})
put("agentDid", processingAgentDid.id)
})
},
issuanceDate = Instant.now().toString()
)
// Create provenance chain credential linking resized to filtered
// This extends the provenance chain: original → resized → filtered
val filterProvenanceChain = VerifiableCredential(
type = listOf("VerifiableCredential", "ProvenanceChainCredential"),
issuer = processingAgentDid.id,
credentialSubject = buildJsonObject {
put("provenance", buildJsonObject {
// Previous entity (resized image)
put("usedEntity", buildJsonObject {
put("entityDid", derivedEntityDid.id)
put("entityHash", "sha256:resized-image-hash")
})
// Current activity (apply filter)
put("activity", buildJsonObject {
put("activityDid", filterActivityDid.id)
put("activityType", "apply-filter")
})
// Generated entity (filtered image)
put("generatedEntity", buildJsonObject {
put("entityDid", filteredEntityDid.id)
put("entityHash", "sha256:filtered-image-hash")
})
put("agent", buildJsonObject {
put("agentDid", processingAgentDid.id)
})
put("timestamp", Instant.now().toString())
})
},
issuanceDate = Instant.now().toString()
)
// Issue credentials
val issuedFilterActivity = agentIssuer.issue(
credential = filterActivityCredential,
issuerDid = processingAgentDid.id,
keyId = agentKey.id,
options = CredentialIssuanceOptions(proofType = "Ed25519Signature2020")
)
val issuedFilterChain = agentIssuer.issue(
credential = filterProvenanceChain,
issuerDid = processingAgentDid.id,
keyId = agentKey.id,
options = CredentialIssuanceOptions(proofType = "Ed25519Signature2020")
)
println("Multi-step provenance chain created:")
println(" - Step 1: Original → Resized")
println(" - Step 2: Resized → Filtered")
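When a workflow has many steps, the repeated construction above can be driven by a loop. The following sketch uses a simplified stand-in for a provenance chain credential; `ChainLink`, `buildChain`, and the way output identifiers are derived from activity names are assumptions made for illustration, not part of TrustWeave.

```kotlin
// Hypothetical, simplified stand-in for one provenance chain credential.
data class ChainLink(
    val usedEntity: String,
    val activityType: String,
    val generatedEntity: String
)

// Builds one link per activity. Each link uses the entity generated by the
// previous link, which is what keeps the chain continuous.
fun buildChain(sourceEntity: String, activities: List<String>): List<ChainLink> {
    val links = mutableListOf<ChainLink>()
    var current = sourceEntity
    for (activity in activities) {
        // Illustrative identifier only; a real workflow would create a new DID here
        val output = "$current/$activity"
        links += ChainLink(usedEntity = current, activityType = activity, generatedEntity = output)
        current = output
    }
    return links
}

fun main() {
    val chain = buildChain("original", listOf("resize-image", "apply-filter", "crop-image"))
    chain.forEach { println("${it.usedEntity} -> [${it.activityType}] -> ${it.generatedEntity}") }
}
```

In the real workflow, the loop body would create the activity and entity DIDs, build the two credentials, and issue them with the agent's key.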
Step 9: Verify Provenance Chain
Purpose: Verify complete provenance chain from source to final entity.
Why This Matters: Verification ensures the provenance chain is authentic and complete. This enables trust in the data processing workflow.
Rationale:
- Chain Verification: Verify each credential in chain
- Continuity Check: Verify entities link correctly
- Agent Verification: Verify agents are legitimate
- Completeness: Ensure no missing steps
import com.trustweave.credential.verifier.CredentialVerifier
import com.trustweave.credential.CredentialVerificationOptions
// Step 9: Verify provenance chain
println("\nStep 9: Verifying provenance chain...")
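// Create verifier to check provenance credentials
// Note: CredentialDidResolver and toCredentialDidResolution() are TrustWeave helpers; import them from the package your TrustWeave version provides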
val verifier = CredentialVerifier(
didResolver = CredentialDidResolver { did ->
didRegistry.resolve(did).toCredentialDidResolution()
}
)
// Verify all credentials in chain
val provenanceChain = listOf(
issuedProvenanceChain,
issuedFilterChain
)
var chainValid = true
provenanceChain.forEachIndexed { index, credential ->
val verification = verifier.verify(
credential = credential,
options = CredentialVerificationOptions(
checkRevocation = true,
checkExpiration = false
)
)
if (verification.valid) {
println("✅ Step ${index + 1} verified")
} else {
println("❌ Step ${index + 1} verification failed:")
verification.errors.forEach { println(" - $it") }
chainValid = false
}
}
if (chainValid) {
println("✅ Complete provenance chain verified!")
println(" - Can trace from source to final entity")
println(" - All transformations verified")
}
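The loop above checks each credential's signature, but the rationale also calls for a continuity check: every step should use exactly the entity, with the same hash, that the previous step generated. A minimal sketch of that check over simplified data follows; `LinkSummary` and `checkContinuity` are hypothetical names, and in practice the fields would be read from each credential's `provenance` object.

```kotlin
// Hypothetical summary of the fields a continuity check needs from one chain credential.
data class LinkSummary(
    val usedEntity: String,
    val usedHash: String,
    val generatedEntity: String,
    val generatedHash: String
)

// Returns a list of problems; an empty list means the chain is continuous.
fun checkContinuity(links: List<LinkSummary>): List<String> {
    val problems = mutableListOf<String>()
    links.zipWithNext().forEachIndexed { index, (previous, next) ->
        if (previous.generatedEntity != next.usedEntity) {
            problems += "Step ${index + 2} does not use the entity generated by step ${index + 1}"
        } else if (previous.generatedHash != next.usedHash) {
            problems += "Step ${index + 2} records a different hash for ${next.usedEntity}"
        }
    }
    return problems
}

fun main() {
    val chain = listOf(
        LinkSummary("original", "sha256:a", "resized", "sha256:b"),
        LinkSummary("resized", "sha256:b", "filtered", "sha256:c")
    )
    println(checkContinuity(chain).isEmpty())

    // Simulate a step that claims a different input hash than the previous output
    val broken = chain.toMutableList().also { it[1] = it[1].copy(usedHash = "sha256:x") }
    println(checkContinuity(broken))
}
```

Running both checks together gives stronger assurance: signatures show who made each claim, and continuity shows that the claims fit together without gaps.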
Step 10: Trace Data Lineage
Purpose: Retrieve complete data lineage from source to final entity.
Why This Matters: Data lineage enables understanding how data was transformed. This is critical for debugging, compliance, and trust.
Rationale:
- Lineage Retrieval: Get all steps in workflow
- Entity Tracking: Track entities through transformations
- Activity Tracking: Track activities performed
- Agent Tracking: Track who performed activities
import kotlinx.serialization.json.jsonObject
import kotlinx.serialization.json.jsonPrimitive
// Step 10: Trace data lineage
println("\nStep 10: Tracing data lineage...")
// Function to trace lineage from final entity back to source
fun traceLineage(
finalEntityDid: String,
provenanceChains: List<VerifiableCredential>
): List<LineageStep> {
val lineage = mutableListOf<LineageStep>()
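var currentEntityDid = finalEntityDid
// Track visited entities so a cyclic (malformed) chain cannot loop forever
val visited = mutableSetOf<String>()
// Trace backwards through provenance chain
while (visited.add(currentEntityDid)) {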
val chainCredential = provenanceChains.find { credential ->
val generatedEntity = credential.credentialSubject.jsonObject["provenance"]?.jsonObject
?.get("generatedEntity")?.jsonObject
?.get("entityDid")?.jsonPrimitive?.content
generatedEntity == currentEntityDid
} ?: break
val provenance = chainCredential.credentialSubject.jsonObject["provenance"]?.jsonObject
?: break
val usedEntity = provenance["usedEntity"]?.jsonObject
?.get("entityDid")?.jsonPrimitive?.content ?: break
val activity = provenance["activity"]?.jsonObject
?.get("activityType")?.jsonPrimitive?.content ?: break
val agent = provenance["agent"]?.jsonObject
?.get("agentDid")?.jsonPrimitive?.content ?: break
lineage.add(LineageStep(
entityDid = currentEntityDid,
activityType = activity,
agentDid = agent,
previousEntityDid = usedEntity
))
currentEntityDid = usedEntity
}
return lineage.reversed() // Reverse to show source to final
}
// Trace lineage from filtered image back to original
val lineage = traceLineage(
finalEntityDid = filteredEntityDid.id,
provenanceChains = provenanceChain
)
println("Data lineage traced:")
lineage.forEachIndexed { index, step ->
println(" ${index + 1}. ${step.activityType} by ${step.agentDid}")
println(" Entity: ${step.entityDid}")
}
}
data class LineageStep(
val entityDid: String,
val activityType: String,
val agentDid: String,
val previousEntityDid: String
)
fun createEntityCredential(
entityDid: String,
entityType: String,
entityHash: String,
metadata: Map<String, String>
): VerifiableCredential {
return VerifiableCredential(
type = listOf("VerifiableCredential", "EntityCredential", "ProvenanceCredential"),
issuer = entityDid, // Entity issues its own credential
credentialSubject = buildJsonObject {
put("id", entityDid)
put("entity", buildJsonObject {
put("entityType", entityType)
put("entityHash", entityHash)
metadata.forEach { (key, value) ->
put(key, value)
}
})
},
issuanceDate = Instant.now().toString(),
expirationDate = null
)
}
Step 11: Anchor Provenance to Blockchain
Purpose: Create immutable record of provenance chain.
Why This Matters: Blockchain anchoring provides a permanent, tamper-evident record of provenance. This enables long-term audit trails and independent verification.
Rationale:
- Immutability: Cannot be tampered with
- Audit Trail: Permanent record
- Verification: Anyone can verify
- Non-Repudiation: Agents cannot deny activities
import com.trustweave.testkit.anchor.InMemoryBlockchainAnchorClient
import com.trustweave.anchor.BlockchainAnchorRegistry
import com.trustweave.anchor.anchorTyped
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
@Serializable
data class ProvenanceRecord(
val sourceEntityDid: String,
val finalEntityDid: String,
val activityCount: Int,
val provenanceDigest: String,
val timestamp: String
)
// Step 11: Anchor provenance to blockchain
println("\nStep 11: Anchoring provenance to blockchain...")
// Setup blockchain client
val anchorClient = InMemoryBlockchainAnchorClient("eip155:1", emptyMap())
val blockchainRegistry = BlockchainAnchorRegistry().apply {
register("eip155:1", anchorClient)
}
// Compute digest of the first provenance chain credential (original → resized)
// Note: this digest does not cover issuedFilterChain; digest every chain credential to commit to the whole workflow
val provenanceDigest = com.trustweave.json.DigestUtils.sha256DigestMultibase(
Json.encodeToJsonElement(
com.trustweave.credential.models.VerifiableCredential.serializer(),
issuedProvenanceChain
)
)
// Create provenance record
val provenanceRecord = ProvenanceRecord(
sourceEntityDid = sourceEntityDid.id,
finalEntityDid = filteredEntityDid.id,
activityCount = 2, // resize + filter
provenanceDigest = provenanceDigest,
timestamp = Instant.now().toString()
)
// Anchor to blockchain
val anchorResult = blockchainRegistry.anchorTyped(
value = provenanceRecord,
serializer = ProvenanceRecord.serializer(),
targetChainId = "eip155:1" // must match the chain ID registered with blockchainRegistry above
)
println("Provenance anchored to blockchain:")
println(" - Transaction hash: ${anchorResult.ref.txHash}")
println(" - Provides immutable provenance record")
println(" - Enables long-term audit trail")
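The snippet above digests only `issuedProvenanceChain`. To commit to every step of the workflow, one option is to hash each serialized credential and then hash the ordered list of those hashes. The sketch below uses only the JDK; `sha256Hex` and `chainDigest` are hypothetical helpers, the inputs are stand-in strings, and the approach assumes credentials are serialized deterministically (for example, canonical JSON).

```kotlin
import java.security.MessageDigest

// Hex-encoded SHA-256 of the given bytes.
fun sha256Hex(bytes: ByteArray): String =
    MessageDigest.getInstance("SHA-256").digest(bytes)
        .joinToString("") { "%02x".format(it) }

// Hashes each serialized credential, then hashes the ordered list of those hashes.
// Changing, removing, or reordering any credential changes the result.
fun chainDigest(serializedCredentials: List<String>): String {
    val perCredential = serializedCredentials.map { sha256Hex(it.toByteArray(Charsets.UTF_8)) }
    return sha256Hex(perCredential.joinToString("\n").toByteArray(Charsets.UTF_8))
}

fun main() {
    // Stand-ins for the serialized resize and filter chain credentials
    val credentials = listOf("""{"step":"resize"}""", """{"step":"filter"}""")
    println(chainDigest(credentials))
    // Order matters: the reversed list produces a different digest
    println(chainDigest(credentials) == chainDigest(credentials.reversed())) // prints "false"
}
```

The resulting value could be stored in `provenanceDigest` so that a single anchored record covers the resize and filter steps together.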
Extensive Step-by-Step Breakdown
The breakdown below numbers the steps starting from service setup, so Step N here corresponds to Step N+1 in the walkthrough above (which counts adding dependencies as Step 1).
Step 1: Setup and Initialization
Purpose: Initialize provenance system with proper key management.
Detailed Explanation:
- Multiple KMS Instances: Separate key management for processors and sources ensures proper isolation
- DID Method Registration: Register DID method for creating identities
- Why Separation Matters:
- Security: If one system compromised, others remain secure
- Accountability: Clear separation of responsibilities
- Scalability: Each system scales independently
Step 2: Create Source Entity DID
Purpose: Establish identity for original data entity.
Detailed Explanation:
- Source Entity: Represents original, unprocessed data
- Entity DID: Provides persistent identifier
- Provenance Start: Beginning of provenance chain
- Why This Matters: Source entity is the root of all provenance. Its DID enables tracking all derived entities.
Step 3: Create Activity and Agent DIDs
Purpose: Establish identities for processing activities and agents.
Detailed Explanation:
- Activity DID: Represents processing step
- Agent DID: Represents who/what performed it
- PROV-O Compliance: Follows PROV-O ontology structure
- Why This Matters: Activities and agents are core PROV-O concepts. Their DIDs enable verifiable provenance relationships.
Step 4: Create Activity Credential
Purpose: Record what processing was performed.
Detailed Explanation:
- Activity Description: What transformation was applied
- Parameters: How transformation was configured
- Timing: When activity occurred
- Why This Matters: Activity credentials enable reproducibility. You can recreate the exact transformation.
Step 5: Create Provenance Chain Credential
Purpose: Link entities through activities.
Detailed Explanation:
- Used Relationship: Source entity was used by activity
- Generated Relationship: Activity generated derived entity
- PROV-O Compliance: Follows PROV-O relationships
- Why This Matters: This credential creates the provenance chain. It proves how entities are related.
Step 6: Issue Credentials with Proof
Purpose: Make provenance credentials verifiable.
Detailed Explanation:
- Key Generation: Generate agent’s signing key
- Proof Generation: Create cryptographic proof
- Credential Issuance: Sign credentials
- Why This Matters: Cryptographic proof ensures provenance cannot be tampered with. This is critical for trust.
Step 7: Build Multi-Step Chain
Purpose: Extend provenance through multiple steps.
Detailed Explanation:
- Chain Extension: Each step adds to chain
- Entity References: Each step references previous entity
- Complete Lineage: Enables full workflow tracking
- Why This Matters: Real workflows have multiple steps. This enables complete provenance tracking.
Step 8: Verify Provenance Chain
Purpose: Ensure provenance chain is authentic.
Detailed Explanation:
- Credential Verification: Verify each credential
- Chain Continuity: Verify entities link correctly
- Agent Verification: Verify agents are legitimate
- Why This Matters: Verification ensures provenance is trustworthy. This enables trust in data processing.
Step 9: Trace Data Lineage
Purpose: Retrieve complete data lineage.
Detailed Explanation:
- Lineage Retrieval: Get all steps in workflow
- Backward Tracing: Trace from final to source
- Complete History: See all transformations
- Why This Matters: Lineage enables understanding data transformations. Critical for debugging and compliance.
Step 10: Anchor to Blockchain
Purpose: Create immutable provenance record.
Detailed Explanation:
- Immutability: Once anchored, the record cannot be altered without detection
- Audit Trail: Permanent record
- Verification: Anyone with access to the chain can verify the anchored record
- Why This Matters: Blockchain anchoring provides a permanent, verifiable record, which is critical for long-term audit trails.
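A common anchoring pattern is to write only a digest of the provenance chain on-chain and keep the credentials themselves off-chain. The sketch below shows that digest step with the JDK's `MessageDigest`; the function names are hypothetical, and the actual submission to a blockchain (through TrustWeave or any other client) is deliberately left out.

```kotlin
import java.nio.ByteBuffer
import java.security.MessageDigest

// Computes a SHA-256 digest over the serialized credentials, in chain order.
fun computeAnchorDigest(serializedCredentials: List<String>): String {
    val digest = MessageDigest.getInstance("SHA-256")
    serializedCredentials.forEach { credential ->
        val bytes = credential.toByteArray(Charsets.UTF_8)
        // Length-prefix each entry so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(ByteBuffer.allocate(4).putInt(bytes.size).array())
        digest.update(bytes)
    }
    // Hex-encode the 32-byte digest.
    return digest.digest().joinToString("") { "%02x".format(it) }
}

// Re-computes the digest and compares it with the anchored value.
fun matchesAnchor(serializedCredentials: List<String>, anchoredDigest: String): Boolean =
    computeAnchorDigest(serializedCredentials) == anchoredDigest
```

A verifier who holds the credentials can recompute the digest and compare it with the anchored value; any edit, reordering, or removal of a credential changes the result.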
Advanced Features
Multi-Agent Workflows
Track workflows with multiple agents:
fun createMultiAgentProvenance(
    sourceEntityDid: String,
    intermediateEntityDid: String,
    finalEntityDid: String,
    agent1Did: String,
    agent2Did: String
): List<VerifiableCredential> {
    // Agent 1 performs the first step: source -> intermediate.
    // "activity-1" is a placeholder; pass the activity's DID in real use.
    val step1Credential = createProvenanceChainCredential(
        usedEntityDid = sourceEntityDid,
        activityDid = "activity-1",
        generatedEntityDid = intermediateEntityDid,
        agentDid = agent1Did
    )

    // Agent 2 performs the second step, consuming agent 1's output.
    val step2Credential = createProvenanceChainCredential(
        usedEntityDid = intermediateEntityDid,
        activityDid = "activity-2",
        generatedEntityDid = finalEntityDid,
        agentDid = agent2Did
    )

    // Credentials are returned in workflow order.
    return listOf(step1Credential, step2Credential)
}
Provenance Queries
Query provenance by various criteria:
import kotlinx.serialization.json.jsonObject
import kotlinx.serialization.json.jsonPrimitive

// Returns the chain credentials whose provenance names the given agent.
fun queryProvenanceByAgent(
    agentDid: String,
    provenanceChains: List<VerifiableCredential>
): List<VerifiableCredential> {
    return provenanceChains.filter { credential ->
        // Credentials that lack the expected fields yield null and are filtered out.
        val agent = credential.credentialSubject.jsonObject["provenance"]?.jsonObject
            ?.get("agent")?.jsonObject
            ?.get("agentDid")?.jsonPrimitive?.content
        agent == agentDid
    }
}

// Returns the chain credentials whose activity has the given type.
fun queryProvenanceByActivityType(
    activityType: String,
    provenanceChains: List<VerifiableCredential>
): List<VerifiableCredential> {
    return provenanceChains.filter { credential ->
        val activity = credential.credentialSubject.jsonObject["provenance"]?.jsonObject
            ?.get("activity")?.jsonObject
            ?.get("activityType")?.jsonPrimitive?.content
        activity == activityType
    }
}
Real-World Use Cases
1. Image Processing Pipeline
Scenario: Track image through multiple processing steps.
Implementation:
fun createImageProcessingProvenance(
    originalImageDid: String,
    processingSteps: List<ProcessingStep>
): List<VerifiableCredential> {
    var currentEntityDid = originalImageDid
    val provenanceChains = mutableListOf<VerifiableCredential>()

    processingSteps.forEach { step ->
        val activityDid = step.activityDid
        val outputEntityDid = step.outputEntityDid

        val chainCredential = createProvenanceChainCredential(
            usedEntityDid = currentEntityDid,
            activityDid = activityDid,
            generatedEntityDid = outputEntityDid,
            agentDid = step.agentDid
        )
        provenanceChains.add(chainCredential)
        currentEntityDid = outputEntityDid
    }

    return provenanceChains
}

data class ProcessingStep(
    val activityDid: String,
    val outputEntityDid: String,
    val agentDid: String
)
2. Data Science Workflow
Scenario: Track data through data science pipeline.
Implementation:
import java.time.Instant
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.put

fun createDataScienceProvenance(
    rawDataDid: String,
    transformations: List<DataTransformation>
): List<VerifiableCredential> {
    var currentDataDid = rawDataDid
    val provenanceChains = mutableListOf<VerifiableCredential>()

    transformations.forEach { transformation ->
        val outputDataDid = transformation.outputDataDid

        val chainCredential = VerifiableCredential(
            type = listOf("VerifiableCredential", "ProvenanceChainCredential"),
            issuer = transformation.analystDid,
            credentialSubject = buildJsonObject {
                put("provenance", buildJsonObject {
                    put("usedEntity", buildJsonObject {
                        put("entityDid", currentDataDid)
                    })
                    put("activity", buildJsonObject {
                        put("activityDid", transformation.transformationDid)
                        put("activityType", transformation.type)
                        // A Map cannot be passed to put() directly, so convert it
                        // to a JSON object entry by entry.
                        put("parameters", buildJsonObject {
                            transformation.parameters.forEach { (key, value) -> put(key, value) }
                        })
                    })
                    put("generatedEntity", buildJsonObject {
                        put("entityDid", outputDataDid)
                    })
                    put("agent", buildJsonObject {
                        put("agentDid", transformation.analystDid)
                    })
                })
            },
            issuanceDate = Instant.now().toString()
        )
        provenanceChains.add(chainCredential)
        // The output of this step becomes the input of the next one.
        currentDataDid = outputDataDid
    }

    return provenanceChains
}

data class DataTransformation(
    val transformationDid: String,
    val type: String,
    val parameters: Map<String, String>,
    val outputDataDid: String,
    val analystDid: String
)
Benefits
- Standard Provenance: PROV-O standard format
- Verifiable Provenance: Cryptographic proof
- Complete Lineage: Track all transformations
- Interoperability: Works across systems
- Compliance: Automated audit trails
- Reproducibility: Reproduce workflows
- Trust: Verify data processing
- Debugging: Trace issues to source
- Accountability: Know who did what
- Transparency: Understand transformations
Best Practices
- PROV-O Compliance: Follow PROV-O ontology
- Complete Records: Record all processing steps
- Entity References: Always reference entity DIDs
- Activity Details: Record activity parameters
- Agent Tracking: Track who performed activities
- Timestamps: Record when activities occurred
- Chain Verification: Verify entire chain
- Blockchain Anchoring: Anchor critical provenance
- Error Handling: Handle verification failures
- Documentation: Document workflow steps
Next Steps
- Learn about Earth Observation Scenario for related integrity concepts
- Explore News Industry Scenario for content provenance
- Check out Supply Chain & Traceability Scenario for related tracking
- Review Core Concepts: Blockchain Anchoring for anchoring details