
Dealing with personal data in Axon Framework


This blog was originally posted on blog.the-experts.nl.

AxonIQ has a module for Axon Framework that can encrypt data in events, so you don't have to worry about it. Still, I think this blog has educational value, so I'm keeping it here. The solution described in this blog is still running in production at the Port of Rotterdam.

Privacy regulations are a pain

By now we all know about GDPR, right? It’s the EU privacy regulation that gives customers certain rights over their personal data. For instance, they have the right to retrieve all data related to them, or to have some or all of it deleted.

This presents us with a dilemma.

Let’s consider the following event to be in our event store:

{
  "payloadType": "com.insidion.CustomerPlacedOrderEvent",
  "payload": {
    "customerName": "C. Boyle",
    "items": [
      {
        "sku": "4823023",
        "amount": 5
      }
    ]
  }
}

Now C. Boyle calls our company. He wants his data removed, but our event store is immutable. We now have three options:

  • Ignore the request, knowing that it might incur a fine by the authorities.
  • Delete the event (and all possible other data) leaving a gap.
  • Alter the event, masking the value.

Ignoring the request is risky and deleting events leaves gaps in your history. That leaves altering the event, if your event store even allows it, by changing the customer’s name to a masked value such as ***. However, modifying the event store is not a good thing to do, since you are altering the past. Furthermore, the name is only erased from the event store; all projections still have the value in their database tables.

Luckily for us, we have found a better way to do this. Some people have gone with a crypto-shredding approach and Axon has a commercial data regulation library that takes care of it for you. Personally, I prefer a more extreme approach.

Ignorance is bliss

You can keep your event store in the dark about the personal data in your system without losing access to it. We can achieve that with a cool Jackson feature: custom serializers. Let’s dive in.

When you publish an event from an aggregate, Axon stores it in the event store. Before it is stored, the event is processed by a Serializer that converts it to an appropriate format. Serializers are also used to convert events back from that representation when Axon reads them from the store. Events are not the only thing they process: snapshots, commands (when using Axon Server), and stored metadata go through them as well.

Axon offers three serializer implementations: XStream, Jackson, and the Java serializer. XStream is the one enabled by default. You could also write one yourself (for example, to serialize to YAML). However, simply because I like JSON more than I like XML, I have configured Axon to use Jackson with the following Spring Boot config.

axon:
  serializer:
    events: jackson
    general: jackson
    messages: jackson

Now Jackson is in charge! And Jackson has just the feature we need for our cause: custom serializers. You can write your own serializers so that certain Java classes are serialized the way you want them to be. By creating a PersonalData wrapper for a String, we can tell Jackson not to serialize the value of the String, but anything we want instead. The wrapper and an event that uses it look like this.

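data class PersonalData(
    val value: String,
    // Id of the row that holds the value; set by the serializer and deserializer shown later on.
    var storedId: Long? = null,
)

data class UserRealNameChangedEvent(
    val realName: PersonalData?,
)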

We can now instruct Jackson to serialize every PersonalData object differently, whether it appears in an event, in metadata, or anywhere else. In our serializer we will look up the value in a database table, or write it there, and store the id in the JSON instead.

This way personal data never even enters the event store while we can still access the value whenever we want. It also allows us to delete or mask the personal data without altering the event store in any way. Let’s get the serializer to work:

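import com.fasterxml.jackson.core.JsonGenerator
import com.fasterxml.jackson.databind.SerializerProvider
import com.fasterxml.jackson.databind.jsontype.TypeSerializer
import com.fasterxml.jackson.databind.ser.std.StdSerializer
import org.springframework.stereotype.Component

@Component
class PersonalDataJacksonSerializer(private val store: PersonalDataStore) : StdSerializer<PersonalData>(PersonalData::class.java) {
    // Write the same id-only JSON when Jackson asks for type information.
    override fun serializeWithType(value: PersonalData?, gen: JsonGenerator, serializers: SerializerProvider, typeSer: TypeSerializer) {
        this.serialize(value, gen, serializers)
    }

    override fun serialize(personalData: PersonalData?, gen: JsonGenerator, provider: SerializerProvider) {
        if (personalData == null || personalData.value.isBlank()) {
            gen.writeNull()
            return
        }

        // Look up (or store) the value and remember its id on the wrapper.
        personalData.storedId = store.retrieveDataId(personalData.value)
        if (personalData.storedId == null) {
            gen.writeNull()
        } else {
            // Only the id ends up in the JSON, never the value itself.
            gen.writeObject(SerializedPersonalData(personalData.storedId!!))
        }
    }
}

data class SerializedPersonalData(
    val id: Long,
)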

As you can see, it is pretty simple. Whenever this serializer encounters a PersonalData instance, it writes the value to a database using the PersonalDataStore, gets its id, and writes that id to the JSON instead. We can use the same principle to reverse the process and access the data again. This is what our deserializer does:

import com.fasterxml.jackson.core.JsonParser
import com.fasterxml.jackson.databind.DeserializationContext
import com.fasterxml.jackson.databind.deser.std.StdDeserializer
import com.fasterxml.jackson.databind.jsontype.TypeDeserializer
import org.springframework.stereotype.Component

@Component
class PersonalDataJacksonDeserializer(private val store: PersonalDataStore) : StdDeserializer<PersonalData>(PersonalData::class.java) {
    override fun deserializeWithType(p: JsonParser, ctxt: DeserializationContext, typeDeserializer: TypeDeserializer, intoValue: PersonalData) = this.deserialize(p, ctxt)

    override fun deserialize(p: JsonParser, ctxt: DeserializationContext) = try {
        p.readValueAs(SerializedPersonalData::class.java)?.let {
            // Retrieve the actual value using the id found in the serialized (JSON) value
            store.retrieveDataValue(it.id)
        }
    } catch (e: Exception) {
        // Something went wrong: the value does not exist or the lookup failed. Return a masked value.
        PersonalData("***", -1)
    }
}
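The snippets above do not show how the two classes reach the ObjectMapper. One possible wiring, as a sketch: Spring Boot registers every Jackson Module bean with its auto-configured ObjectMapper, and Axon's Spring Boot starter normally builds its Jackson serializer from that ObjectMapper. The class name PersonalDataJacksonConfig and the module name are made up for this illustration, and the demo application may wire things differently.

```kotlin
import com.fasterxml.jackson.databind.Module
import com.fasterxml.jackson.databind.module.SimpleModule
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration

// Hypothetical configuration class; not taken from the demo application.
@Configuration
class PersonalDataJacksonConfig {

    // Assumption: the ObjectMapper that Axon uses is the Spring Boot managed one,
    // so a Module bean is enough to make it aware of PersonalData.
    @Bean
    fun personalDataModule(
        serializer: PersonalDataJacksonSerializer,
        deserializer: PersonalDataJacksonDeserializer,
    ): Module = SimpleModule("PersonalDataModule")
        .addSerializer(PersonalData::class.java, serializer)
        .addDeserializer(PersonalData::class.java, deserializer)
}
```

If you build the ObjectMapper yourself, calling registerModule on it with the same module should have the same effect.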

In conclusion, this approach enables you to keep personal data out of your event store while still being able to see the data, delete the data or mask the data. The only thing you have to do is wrap it in a PersonalData class. Great, isn’t it?
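To make the "delete or mask" part concrete, here is a minimal in-memory sketch of what a PersonalDataStore could look like. The real one is backed by a database table and is not shown in this post, so everything here is an assumption: the class name InMemoryPersonalDataStore, the erase method, the readOrMask helper, and the choice to throw when an id is unknown.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

// Same shape as the PersonalData wrapper used in this post.
data class PersonalData(val value: String, var storedId: Long? = null)

// Hypothetical in-memory stand-in for the database-backed PersonalDataStore.
class InMemoryPersonalDataStore {
    private val nextId = AtomicLong(1)
    private val valuesById = ConcurrentHashMap<Long, String>()
    private val idsByValue = ConcurrentHashMap<String, Long>()

    // Look up the id for a value, storing the value first if it is new.
    fun retrieveDataId(value: String): Long =
        idsByValue.computeIfAbsent(value) { v ->
            val id = nextId.getAndIncrement()
            valuesById[id] = v
            id
        }

    // Resolve an id back to its value; throws once the value has been erased.
    fun retrieveDataValue(id: Long): PersonalData {
        val value = valuesById[id] ?: throw NoSuchElementException("No personal data for id $id")
        return PersonalData(value, id)
    }

    // Hypothetical erasure hook: only this store changes, events stay untouched.
    fun erase(id: Long) {
        valuesById.remove(id)?.let { idsByValue.remove(it) }
    }
}

// Mirrors the deserializer's fallback: any failure yields a masked value.
fun readOrMask(store: InMemoryPersonalDataStore, id: Long): PersonalData =
    try {
        store.retrieveDataValue(id)
    } catch (e: Exception) {
        PersonalData("***", -1)
    }
```

After erase, events that reference the id still deserialize, but the name comes back as ***. Note that this sketch reuses one id per distinct value, so erasing an id removes that value for every event that shares it.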

Demo time

Let’s take a look at the following aggregate. The aggregate keeps a user’s real name, wrapped in a PersonalData object. This allows access to the value while prohibiting it from being stored in the event store or in snapshots. In all other ways, it works the same as a String value would.

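import org.axonframework.commandhandling.CommandHandler
import org.axonframework.eventsourcing.EventSourcingHandler
import org.axonframework.modelling.command.AggregateIdentifier
import org.axonframework.modelling.command.AggregateLifecycle
import org.axonframework.spring.stereotype.Aggregate

@Aggregate
class ProfileAggregate {
    @AggregateIdentifier
    private lateinit var username: String
    private lateinit var realName: PersonalData

    @CommandHandler
    constructor(command: CreateProfileCommand) {
        // Wrap the raw name so it never reaches the event store as plain text.
        AggregateLifecycle.apply(ProfileCreatedEvent(command.username, PersonalData(command.realName)))
    }

    @EventSourcingHandler
    fun onEvent(event: ProfileCreatedEvent) {
        this.username = event.username
        this.realName = event.realName
    }

    constructor() {
        // Here for Axon
    }
}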

When we now create an account through the REST endpoint, the aggregate publishes the following event.

{
  "username": "axon102",
  "realName": {
    "id": 1
  }
}

As you can see, there is no personal data present in the event. Of course, the personal data still exists, but it now lives in a database table instead of the event store.

More code?

I cannot post all the code here, so I selected the important bits and pieces. You can find the full source code of the demo application here: https://github.com/CodeDrivenMitch/axon-102/tree/main/personal-data/src/main/java/com/insidion/axon102

Take a look and try it out for yourself!

Caveats

All power comes at a price. Each time an event containing personal data is read or written, the application has to consult a database table. That makes reading and writing events a little bit slower, but since it is a simple lookup by id, I think the impact is negligible unless you have insane amounts of personal data in your events.

The serializer approach presented here only works out of the box for new events or new projects. Projects that already have personal data in their event store are at a disadvantage: you either have to get that data out first, or decide that only new data is written to the database table. Still, it is never too late to implement it and write an upcaster to take advantage of it!
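As a rough idea of what such an upcaster could look like, here is a sketch built on Axon's SingleEventUpcaster. It assumes older events stored realName as a plain JSON string and rewrites that into the id object the deserializer expects. The class name, the event type name com.example.ProfileCreatedEvent, and the revision values are placeholders, and PersonalDataStore is the store used by the serializer. Keep in mind that an upcaster only changes what is read: the stored event still contains the original value until you deal with the store itself.

```kotlin
import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.databind.node.ObjectNode
import org.axonframework.serialization.SimpleSerializedType
import org.axonframework.serialization.upcasting.event.IntermediateEventRepresentation
import org.axonframework.serialization.upcasting.event.SingleEventUpcaster

// Hypothetical upcaster: rewrites legacy payloads while they are being read.
class ProfileCreatedEventUpcaster(private val store: PersonalDataStore) : SingleEventUpcaster() {
    // Assumption: placeholder type name, and legacy events were stored without a revision.
    private val legacyType = SimpleSerializedType("com.example.ProfileCreatedEvent", null)

    override fun canUpcast(representation: IntermediateEventRepresentation): Boolean =
        representation.type == legacyType

    override fun doUpcast(representation: IntermediateEventRepresentation): IntermediateEventRepresentation =
        representation.upcastPayload(
            SimpleSerializedType(legacyType.name, "1"), // assumed new revision
            JsonNode::class.java,
        ) { payload ->
            val event = payload as ObjectNode
            val realName = event.get("realName")
            if (realName != null && realName.isTextual) {
                // Move the raw value into the store and leave only its id in the payload.
                val id = store.retrieveDataId(realName.asText())
                if (id == null) event.putNull("realName") else event.putObject("realName").put("id", id)
            }
            event
        }
}
```

Because the upcaster runs on every read of a legacy event, this only works well if retrieveDataId returns the existing id for a value it has already seen.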

Conclusion

You can use Jackson to your advantage to keep personal data out of your event store. This saves you the hassle of (illegally) editing your event store or deleting events, if that is even possible.