Data Classes – The Definitive Guide 2021

When it comes to data classes, I often receive questions like: “Hey, what exactly is an aggregate?”, “What the heck is a DTO?” or “What is the exact difference between aggregates and entities?”

So, let’s not waste any time and get started 🙂

Disclaimer: In this tutorial, I will favor C# because this is my primary language. Of course, you can use the same techniques in your preferred language as the language does not really matter, from my point of view.

Disclaimer 2: Data classes are also often called “Models”. I do not really like this term. From my point of view, it belongs to certain patterns like MVVM, MVC and MVP (also called MV*), and that is exactly what data classes are not.

Disclaimer 3: My opinion is sometimes controversial because it differs from that of some other developers with a high reputation.

Data Transfer Objects

A simple way to store and retrieve data from any data store. Data transfer objects are independent of each other.

Some people also call them POCOs (plain old CLR objects) or POJOs (plain old Java objects), but for our purposes they all mean the same thing.

To understand them, you just need to imagine a database table. These objects are the representation of these tables without any direct relationship.

Relationships can be established by using Ids, which are also stored within the database.

class IssueDto
{
    public int Id { get; set; }
    public int UserId { get; set; }
    public string Name { get; set; }
    public DateTime Created { get; set; }
    public int Estimate { get; set; }
}
class UserDto
{
    public int Id { get; set; }
    public int RoleId { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

Those objects can be easily mapped to a database table. A further benefit is that it is not necessary to store entire (and maybe complex) aggregates.

DTOs should only be used to communicate with the database and should also be defined within the data layer.
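To illustrate what such a relationship via IDs looks like in practice, here is a minimal sketch. The in-memory list only stands in for a database table, and the DTOs are shortened versions of the ones above; both are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The list stands in for the user table (assumption for this sketch).
var users = new List<UserDto>
{
    new UserDto { Id = 1, Name = "Alice" },
    new UserDto { Id = 2, Name = "Bob" }
};
var issue = new IssueDto { Id = 42, UserId = 2, Name = "Fix login" };

// The relationship is resolved by comparing IDs, not by following an object reference.
UserDto assignee = users.Single(u => u.Id == issue.UserId);
Console.WriteLine($"{issue.Name} is assigned to {assignee.Name}");

class IssueDto
{
    public int Id { get; set; }
    public int UserId { get; set; }
    public string Name { get; set; } = string.Empty;
}

class UserDto
{
    public int Id { get; set; }
    public string Name { get; set; } = string.Empty;
}
```

The issue itself never holds a UserDto, so both objects can be stored and loaded independently.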

What is a DTO not?

Please look at the name carefully. It is a data transfer object. So, the only purpose of this kind of data class is to transfer data from A to B.

In other words, there is no need to implement any kind of business logic here. You should only design them to have some basic properties.

Value Objects

If you need more than just a number or a text, you can use value objects.

Value objects are really cool. But, from my point of view, they have one major disadvantage: value objects create a huge overhead within your code base.

And if I say a huge overhead, I really mean a lot!

Let’s look at the DTO example above. We have 9 properties in total. Every property would need its own value object.

The best practice is to name them after the class and the property name.

But again, you can only do this, if you have named your properties very well.

class IssueName
{
    private readonly string _value;
}

Value objects need to be immutable. That means they cannot be changed once they are created.

Two value objects must also be equal whenever they represent the same value, even if they have been recreated or have a different format.

Consider the following examples. In the first one, both the value and the format are the same, but the purpose is totally different. In the second one, the format is different, but the value is (nearly) exactly the same.

var issueId = 42;
var userId = 42;
Console.WriteLine(issueId == userId); // TRUE

var distanceInMeters = "3218 meters";
var distanceInMiles = "2 miles";
Console.WriteLine(distanceInMeters == distanceInMiles); // FALSE

Both results are wrong. The issue ID and the user ID are not the same thing, even though both hold the same number. The same goes for the second example: 2 miles are (more or less) 3218 meters, but comparing the raw strings tells you that they differ.
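A value object can fix the second comparison by normalizing everything to one unit. Here is a minimal sketch, assuming a hypothetical Distance class that stores whole meters and cuts off fractions (so 2 miles become 3218 meters, as in the example above):

```csharp
using System;

var inMeters = Distance.FromMeters(3218);
var inMiles = Distance.FromMiles(2);
Console.WriteLine(inMeters.Equals(inMiles)); // True

// Hypothetical value object: every distance is normalized to whole meters.
public sealed class Distance : IEquatable<Distance>
{
    private const decimal MetersPerMile = 1609.344m;
    private readonly long _meters;

    private Distance(long meters) => _meters = meters;

    public static Distance FromMeters(long meters) => new Distance(meters);

    // Fractions of a meter are cut off, so 2 miles (3218.688 m) become 3218 m.
    public static Distance FromMiles(decimal miles)
        => new Distance((long)Math.Truncate(miles * MetersPerMile));

    public bool Equals(Distance other) => other is not null && _meters == other._meters;

    public override bool Equals(object obj) => Equals(obj as Distance);

    public override int GetHashCode() => _meters.GetHashCode();
}
```

Whether cutting off fractions is acceptable depends on your domain; the point is only that the comparison happens on one normalized value instead of on the raw text.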

Making value objects immutable.

Okay, but how do we solve this issue? In C# you can override the equality members GetHashCode() and Equals(). As far as I know, there are equivalents in Java.

If you do this, make sure to override both Equals and GetHashCode correctly because they are also used to identify the value object within a List or a HashSet.

public static bool operator ==(IssueName left, IssueName right)
    => left.Equals(right); // (1)

public static bool operator !=(IssueName left, IssueName right)
    => !left.Equals(right); // (2)

public override bool Equals(object obj)
    => Equals(obj as IssueName); // (3)

public bool Equals(IssueName other)
{
    if (ReferenceEquals(other, null)) // (4)
    {
        return false;
    }
    
    return _value == other._value; // (5)
}

public override int GetHashCode()
{
    unchecked // (6)
    {
        var hashCode = -1093546721; // (7)
        hashCode = (hashCode * 42) ^ _value.GetHashCode(); // (8)

        return hashCode;
    }
}

Let’s quickly go over the code:

  1. We want to use the Equals method we are defining below.
  2. Just the negation of the operator above.
  3. We do not want a direct cast here. Please make sure you are using a safe cast (as). This passes null into the next method instead of throwing an InvalidCastException.
  4. If null has been passed (or an object of any other type, which the safe cast has turned into null), we return false. Do not compare the parameter with this by using ReferenceEquals alone because we would lose all the benefits we want to achieve.
  5. Just compare all relevant fields of the object by using the equality operator.
  6. The unchecked keyword turns off overflow checking for the hash calculation, so the multiplication silently wraps around instead of throwing an OverflowException in a checked context. Please check the documentation for more information.
  7. Create a random offset for each value object by defining a “random” int. Just make sure it differs for every value object class. We do not want to generate the same hash for different value objects with the same underlying type (e.g. IssueId and CommentId).
  8. Use the same fields you picked in point 5 here as well: duplicate the entire line and exchange the name of the field.
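One caveat regarding the operators (1) and (2): left.Equals(right) throws a NullReferenceException when left is null. The following sketch shows a null-tolerant variant. It is my suggestion rather than part of the listing above, and it assumes that two null references should count as equal. The public constructor is only there to keep the example short.

```csharp
using System;

IssueName a = null;
IssueName b = new IssueName("Fix login");
Console.WriteLine(a == b);                          // False, and no exception
Console.WriteLine(b == new IssueName("Fix login")); // True

public sealed class IssueName : IEquatable<IssueName>
{
    private readonly string _value;

    // Public constructor only to keep this sketch short.
    public IssueName(string value) => _value = value;

    // "is null" never calls the overloaded == operator, so there is no recursion.
    public static bool operator ==(IssueName left, IssueName right)
        => left is null ? right is null : left.Equals(right);

    public static bool operator !=(IssueName left, IssueName right)
        => !(left == right);

    public bool Equals(IssueName other)
        => other is not null && _value == other._value;

    public override bool Equals(object obj) => Equals(obj as IssueName);

    public override int GetHashCode() => _value.GetHashCode();
}
```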

Creating a new instance of a value object

To be honest, I want to keep the feel of native types. So, I want to be able to write things like:

IssueName name = "As project owner, I would like ...";

or

if (a.Name == b.Name)
{
    // name of both value objects is identical
}

Or some math operations (like adding or subtracting values) for numeric types.

This can be achieved with operator overloading.

private IssueName(string value) // (1)
    => _value = value;

public static implicit operator IssueName(string value) // (2)
    => new IssueName(value);

public static explicit operator string(IssueName name) // (3)
    => name?._value ?? string.Empty; // (4)

public override string ToString()
    => (string)this; // (5)

  1. Notice the private constructor. The object itself cannot be created by using the new operator.
  2. This is the one and only place to create the object. Thanks to the implicit keyword, you do not need to cast a string into the specific type.
  3. From my experience, it is better to use the explicit operator if you want to get the internal value. Do not expose the internal value representation directly. Always encapsulate it by using other methods.
  4. In this case, you can even check if the object is null and return a default value if so. So, your app will not crash if the issue name is null. Surprised? 🙂
  5. As there is already an operator to convert the object to a string, you can just cast the object to a string and return it.
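Put together, the conversions can be used as in the following sketch. The class repeats only the members discussed in this section; the equality members are left out for brevity.

```csharp
using System;

IssueName name = "As project owner, I would like ..."; // implicit: string -> IssueName
string raw = (string)name;                              // explicit: IssueName -> string
IssueName missing = null;                               // a plain null reference

Console.WriteLine(raw);
Console.WriteLine(((string)missing).Length); // 0, the null case falls back to string.Empty

public sealed class IssueName
{
    private readonly string _value;

    private IssueName(string value) => _value = value;

    public static implicit operator IssueName(string value) => new IssueName(value);

    public static explicit operator string(IssueName name) => name?._value ?? string.Empty;

    public override string ToString() => (string)this;
}
```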

String on steroids – with some pitfalls

From now on, we have a pimped string, which is only valid for the name of each issue.

Please bear in mind, that you lose any “native” functionalities if you are using value objects. So, you will not be able to write something like:

issue.Name = issue.Name.ToUpper();

If you want to do this, you need to add this method into your value object.

I highly recommend not doing this. Use descriptive names instead. Why do you need to transform the name? What exactly is the requirement? Name the method after the requirement.

You can add any custom method to your value object. Methods, which can describe your source code much better.

Just to give you an idea, here is an example:

if (issue.Estimation.BelowOneDay())
{
    _issueSolver.TryToSolve(issue);

    if (issue.IsSolved())
    {
        issue.Estimation = issue.Estimation.LogWork("4h");
    }
}
else
{
    // contact product owner
}

If you show such code to anyone with domain knowledge, they will immediately understand the workflow.

You will also save many so-called helper methods, which just do the same job in different places.
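Just as a sketch, a value object behind such code could look like the following. The Estimation class, the 8-hour working day and the TimeSpan parameter of LogWork (instead of a text like "4h") are assumptions for this example.

```csharp
using System;

var estimation = Estimation.FromHours(6);
Console.WriteLine(estimation.BelowOneDay()); // True: 6h is less than one 8h working day

estimation = estimation.LogWork(TimeSpan.FromHours(4));
Console.WriteLine(estimation);               // 02:00:00 remaining

// Hypothetical value object; the 8-hour working day is an assumption.
public sealed class Estimation
{
    private static readonly TimeSpan WorkingDay = TimeSpan.FromHours(8);
    private readonly TimeSpan _remaining;

    private Estimation(TimeSpan remaining) => _remaining = remaining;

    public static Estimation FromHours(double hours)
        => new Estimation(TimeSpan.FromHours(hours));

    public bool BelowOneDay() => _remaining < WorkingDay;

    // Immutable: logging work returns a new instance and never goes below zero.
    public Estimation LogWork(TimeSpan worked)
        => new Estimation(worked >= _remaining ? TimeSpan.Zero : _remaining - worked);

    public override string ToString() => _remaining.ToString();
}
```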

But is this really worth the effort?

Keep it simple, just KISS

In most cases, you do not need to create a value object for each property for every entity or even aggregate. You can reuse or just combine some properties.

You can create a value object for a date property, or just use a native type like DateTime.

It is also possible to combine some values. If you want to store any kind of font, it doesn’t make much sense to create a value object for font size, font family (e.g. Arial, Times New Roman), font style (e.g. bold, regular, italic) etc. In this case, it is better to create a value object called “Font” and put all these three properties to it.

This value object can be reused in different other entities within your application.
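If you are on C# 9 or newer, a record is probably the shortest way to write such a combined value object. The following is only a sketch; the property names and the FontStyle enum are my assumptions.

```csharp
using System;

var heading = new Font("Arial", 12, FontStyle.Bold);
var title = new Font("Arial", 12, FontStyle.Bold);
Console.WriteLine(heading == title);  // True: records compare by value

// "with" creates a modified copy; the original stays untouched.
var larger = heading with { Size = 14 };
Console.WriteLine(larger == heading); // False

public enum FontStyle { Regular, Bold, Italic }

// A positional record is immutable and gets value-based Equals/GetHashCode for free.
public sealed record Font(string Family, double Size, FontStyle Style);
```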

Value objects and data storage

If a value object becomes more complex (contains more than 4 properties), you should consider creating a separate database table for it and giving it an identifier.

In this case, the value object would immediately become an entity. Usually, there are no specific tables for value objects within the database!

Always use value objects for identifiers

If you, for whatever reason, don’t want to use value objects for every type, at least consider using them for identifiers.

But why should we use a new class for each ID instead of a simple int or GUID?

Okay, take a look at this example value object:

public sealed class CommentId : IValueObject, IEquatable<CommentId>
{
    private readonly uint _value;

    private CommentId(uint id)
    {
        if (id == 0)
        {
            throw new ArgumentException("The comment id must be a positive number.");
        }

        _value = id;
    }

    public static implicit operator CommentId(uint id)
        => new CommentId(id);

    public static explicit operator uint(CommentId id)
        => id._value;

    public static explicit operator string(CommentId id)
        => id._value.ToString();

    public static bool operator ==(CommentId a, CommentId b)
        => a.Equals(b);

    public static bool operator !=(CommentId a, CommentId b)
        => !a.Equals(b);

    public bool Equals(CommentId? other)
    {
        if (ReferenceEquals(other, null))
        {
            return false;
        }

        return _value == (uint) other;
    }

    public override bool Equals(object obj)
        => Equals(obj as CommentId);

    public override int GetHashCode()
    {
        unchecked
        {
            var hashCode = -1093546721;
            hashCode = (hashCode * 42) ^ _value.GetHashCode();

            return hashCode;
        }
    }

    public override string ToString()
        => (string) this;
}

Currently, the underlying type is a uint (I still don’t understand why some people use ints for an ID. Have you ever seen an ID of -42?). But what if this changes to a Guid? Or, for any reason, to a string? The value object also makes it possible to have a mix of uints and Guids for IDs.

If you follow this simple rule, you only need to adjust one data class instead of, at least, two.

By using the fundamentals of the object-oriented programming, you can “soft-validate” your value object.

Take a look at the constructor. I have a validation here, that my ids cannot be in an invalid state.

Value objects will prevent you from making mistakes

If you are using the same type for every ID, you can accidentally assign a wrong ID to your objects.

var issueId = new IssueId(42);

issue.Comments.Add(new Comment()
{
    Id = issueId,
    Content = "ooops"
});

While typing these lines of code, the IDE immediately yells at me “DONT DO THIS!!!”. And, to be honest, this is good. With GUIDs, such a mix-up would most likely have no effect because the ID would simply not match any record. But with simple integers, it can harm your entire application because you can assign a wrong data set to the object.
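Here is a compact sketch of this safety net. The records are simplified stand-ins for the hand-written value objects above, and the Comment class is reduced to two properties.

```csharp
using System;

var issueId = new IssueId(42);
var comment = new Comment { Id = new CommentId(42), Content = "looks good" };

// comment.Id = issueId; // does not compile: an IssueId cannot be converted to a CommentId

Console.WriteLine(issueId.Value == comment.Id.Value); // True: the same number ...
Console.WriteLine(issueId.Equals(comment.Id));        // False: ... but different types

// Simplified stand-ins for the value objects (assumption for this sketch).
public sealed record IssueId(uint Value);
public sealed record CommentId(uint Value);

public sealed class Comment
{
    public CommentId Id { get; set; }
    public string Content { get; set; }
}
```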

No language support – no excuse

Unfortunately, not every language supports features like equality members or operator overloading.

In this case, you can fall back on the old techniques our grandpas used back in the 90s.

class CommentId:

    __value = 0

    def __init__(self, value:int):
        if value <= 0:
            raise ValueError("comment id must be a positive number")
        
        self.__value = value

    def equals(self, other) -> bool:
        if isinstance(other, type(self)):
            return self.__value == other.__value
        
        return False
        
    def get_value(self):
        return self.__value

You can also use setter/getter methods to encapsulate your fields. Afterwards, you can access the values by writing code similar to this:

my_comment = Comment()
my_comment.set_id(CommentId(2))

comment_id = my_comment.get_id().get_value()

Yes, I know, Python actually has equality comparison as well as properties. So, there is no need to do this in Python. This is only for demonstration purposes. Please do not blame me for that 🙂

However, you will need to be much stricter regarding the types. Such (scripting) languages will not protect you from assigning wrong types to any values. And, as this is a scripting language, you will notice this issue at runtime, not at compile time.

Entities

Most data classes are entities. But what are these entities, and what are they used for?

Entities are the sibling of value objects.

They behave nearly the same, but with two major differences.

Entities are mutable and must have a unique identifier. This identifier should also be a value object.

Entities cannot be standalone

Let’s go ahead with our example domain.

I want to create an entity for a comment because each Issue can have several comments attached. We will create this aggregate in the next step.

public class Comment
{
    public CommentId Id { get; set; }
    public UserId User { get; set; }
    public IssueId Issue { get; set; }
    public DateTime CreationTime { get; set; } = DateTime.UtcNow;
    public CommentText Content { get; set; }
}

Every property of this entity is a value object. No plain strings, ints, doubles, etc. anymore. Only value objects.

In this case, it holds two references to other aggregates. But when I say reference, I do not mean a memory reference. It is, again, a value object with the ID of another aggregate. We will come back to this point when it comes to aggregates.

Immutability and data content

Unlike value objects, multiple entities can have the same parameters. This is totally valid. A user can add multiple comments, with the same content to the same issue.

A way better example is a project version. Every project has multiple versions as well as multiple issues. There is a relationship between a version and an issue. But nearly every project has a version “1.0”. In this case, there would be multiple entities with different IDs but the same content.

Or in other words, entities aren’t reusable. If you create a reusable entity, it becomes an aggregate.

Business logic within entities

There are two completely different opinions when it comes to putting business logic inside an entity.

By definition, each entity must guarantee a consistent state. It can never become invalid. I totally agree with this. For this reason, we need to modify the entity a bit:

public class Comment : ILocalEntity
{
    public CommentId Id { get; }
    public UserId User { get; }
    public IssueId Issue { get; }
    public DateTime CreationTime { get; }
    public CommentText Content { get; private set; }

    public Comment(UserId user, IssueId issue)
    {
        User = user;
        Issue = issue;
        CreationTime = DateTime.UtcNow;
    }

    public Comment(CommentId id, UserId user, IssueId issue, DateTime creationTime)
        : this(user, issue)
    {
        Id = id;
        CreationTime = creationTime;
    }

    public void SetContent(CommentText content)
    {
        Content = content;
    }
}

Now, we have two use cases:

  1. Creating a new entity:
    For this, we have the first constructor. Every comment has a user, who is writing the comment and the issue, which the comment belongs to. As the comment is being created right now, we can also set the time here.
  2. Reconstruct from data store:
    In this case, we are reading the comment from the database. It is not possible to change anything except the content.

Or in other words: Only create get-only properties and change them via separate methods. That way, you are able to validate the state within your entity. From my point of view, though, it makes much more sense to put this kind of validation inside the value objects.

I also like to put some “pseudo-validation logic” inside the entity or, if there is a collection of items, the total number of items. A possible member could be DoesHaveContent or (not in this case) NumberOfComments or CommentCount.

No matter what you do, never ever inject or even instantiate any kind of logic classes inside your entities.

If you really, really need to use any kind of logic, which is not independent of the entity (or value object) consider using the visitor design pattern.

Checking for Equality

While every ValueObject needs to implement some equality members, do not do this for Entities (or even Aggregates).

If you want to compare different entities for equality, consider using the EqualityComparer:

public sealed class CommentEqualityComparer : IEqualityComparer<Comment>
{
    public bool Equals(Comment x, Comment y)
    {
        var anyParameterIsNull = ReferenceEquals(x, null) || ReferenceEquals(y, null); 
        
        if (anyParameterIsNull)
        {
            return false;
        }

        var isEqual = x.Id.Equals(y.Id); 
        isEqual &= x.User.Equals(y.User);
        isEqual &= x.Issue.Equals(y.Issue); 
        isEqual &= x.CreationTime.Equals(y.CreationTime);
        isEqual &= x.Content.Equals(y.Content);
        
        return isEqual;
    }

    public int GetHashCode(Comment obj)
    {
        unchecked
        {
            var hashCode = -8137449;
            hashCode = (hashCode * 42) ^ obj.Id.GetHashCode();
            hashCode = (hashCode * 42) ^ obj.User.GetHashCode();
            hashCode = (hashCode * 42) ^ obj.Issue.GetHashCode();
            hashCode = (hashCode * 42) ^ obj.CreationTime.GetHashCode();
            hashCode = (hashCode * 42) ^ obj.Content.GetHashCode();
            return hashCode;
        }
    }
}

Afterwards, you can pass this EqualityComparer to any kind of hash-based collection (e.g. HashSet or Dictionary).
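Such a usage could look like the following sketch. To keep it short, the Comment here is a simplified stand-in with primitive properties, and the comparer uses HashCode.Combine instead of the manual hash calculation.

```csharp
using System;
using System.Collections.Generic;

var comments = new HashSet<Comment>(new CommentEqualityComparer())
{
    new Comment(1, "First!"),
    new Comment(1, "First!"), // same values, so it is rejected as a duplicate
    new Comment(2, "First!")
};
Console.WriteLine(comments.Count); // 2

// Simplified stand-in for the Comment entity (primitives instead of value objects).
public sealed class Comment
{
    public Comment(int id, string content)
    {
        Id = id;
        Content = content;
    }

    public int Id { get; }
    public string Content { get; }
}

public sealed class CommentEqualityComparer : IEqualityComparer<Comment>
{
    public bool Equals(Comment x, Comment y)
        => x is not null && y is not null && x.Id == y.Id && x.Content == y.Content;

    public int GetHashCode(Comment obj) => HashCode.Combine(obj.Id, obj.Content);
}
```

The entity itself stays free of equality members; only the collection knows how to compare.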

Aggregates

The heart of every domain model. Each business application has at least one aggregate.

Aggregates are, like entities, also mutable. That means aggregates can be modified.

Unlike entities, each aggregate should be unique by its internal values.

It is possible to have multiple aggregates within the domain. But, unlike entities, aggregates should only be loaded on-demand. The easiest way would be, to hold some references on other aggregates within any parent aggregate or entity.

Example aggregate design

public class Issue
{
    public IssueId Id { get; set; } // (1)
    public IssueName Name { get; set; } // (2)
    public IssuePriority Priority { get; set; } // (3)
    public DateTime CreationTime { get; set; } // (4)
    public UserId Assignee { get; set; } // (5)
    public TimeValue Estimated { get; set; } // (6)
    public List<IssueId> SubTasks { get; } = new List<IssueId>(); // (7)
    public List<Issue> SubTaskAggregates { get; } = new List<Issue>(); // NEVER do this!
    public List<Comment> Comments { get; } = new List<Comment>(); // (8)
}

This aggregate is one of my typical data class designs:

  1. The Id is a value object, which holds an integer in this case. But it can also be something else.
  2. Name is also a value object.
  3. Priority is a mixture of an enum and a value object. Just because I dislike magic values within my code 🙂
  4. CreationTime describes the time when this issue was created within the issue tracking system. In this case, it is enough to use the .NET DateTime struct. Yes, it is also a value object.
  5. Assignee would usually be a reference to another aggregate: User. From my point of view, the best choice is to hold the soft reference here and query the data store when you actually need the entire object. Sometimes it is necessary to display a bit of data from the aggregate. In this case, you can introduce a further entity which holds only this value.
  6. TimeValue could also be a built-in value object (TimeSpan). But in this special case, I have decided to use a custom one, just because I want to use some special logic. This value object should be able to handle values like “4h 30m”. On the other hand, it should also consider working days. So, 24 hours should be 3 days (assuming 8-hour working days) instead of only one.
  7. SubTasks is an optional list of further references to other aggregates.
  8. Comments is a list of other entities.
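To give you an idea of point 6, here is a minimal sketch of such a TimeValue. The accepted text format ("<n>h <n>m"), the 8-hour working day and all member names are assumptions for this example.

```csharp
using System;
using System.Text.RegularExpressions;

var estimate = TimeValue.Parse("4h 30m");
Console.WriteLine(estimate.TotalMinutes);              // 270
Console.WriteLine(TimeValue.Parse("24h").WorkingDays); // 3

// Hypothetical sketch; the text format and the 8-hour working day are assumptions.
public sealed class TimeValue
{
    private const int MinutesPerWorkingDay = 8 * 60;
    private static readonly Regex Pattern =
        new Regex(@"^\s*(?:(?<h>\d+)h)?\s*(?:(?<m>\d+)m)?\s*$");

    private TimeValue(int totalMinutes) => TotalMinutes = totalMinutes;

    public int TotalMinutes { get; }

    // 24 hours of work are three working days, not one calendar day.
    public double WorkingDays => (double)TotalMinutes / MinutesPerWorkingDay;

    public static TimeValue Parse(string text)
    {
        var match = Pattern.Match(text);
        var hours = match.Groups["h"];
        var minutes = match.Groups["m"];

        // Reject text that contains neither an hour nor a minute part.
        if (!match.Success || (!hours.Success && !minutes.Success))
        {
            throw new FormatException($"Cannot parse '{text}' as a time value.");
        }

        var total = (hours.Success ? int.Parse(hours.Value) * 60 : 0)
                  + (minutes.Success ? int.Parse(minutes.Value) : 0);
        return new TimeValue(total);
    }
}
```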

Lazy vs. eager loading

Maybe you have noticed that I am not loading other aggregates directly but just using references to them.

This can be very handy indeed. Consider a data store with, let’s say, half a million data sets.

Now, you want to load one Issue by calling something like this:

var issue = _issueService.GetIssue(issueId);

This would load the issue itself. Once it comes to the project version, it would also load every issue within this specific version, which may contain, let’s say, 113 other issues.

Then it comes to the assignee. The assignee has already solved thousands of other issues. As the User also has a collection of all their issues, you will load several thousand other issues as well. And for each of them you will, again, load every issue assigned to the same user. Some seconds later, you end up with something like a StackOverflowException or a deadlock.

This is, of course, the very worst-case scenario. Not every domain has such circular references (they should be avoided anyway). But of course, you can end up with a very complex object structure which contains too much information. Just to display some basic information, you do not need to know anything else about the user or the “sibling issues”.

Just consider another domain. If online shops like Amazon loaded every product (including every review) when you just click on one category… Yeah, this would be very time-consuming.

To improve your runtime, keep your aggregates as tiny as you can.
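The following sketch shows the idea of loading another aggregate only on demand. The repository, the dictionary behind it and the simplified records are assumptions for this example.

```csharp
using System;
using System.Collections.Generic;

var users = new UserRepository();
var issue = new Issue(42, "Fix login", 7);

// Showing the issue does not touch the user store at all.
Console.WriteLine(issue.Name);

// Only when the assignee is actually needed do we query for it.
User assignee = users.GetById(issue.AssigneeId);
Console.WriteLine(assignee.Name);

public sealed record User(int Id, string Name);

// The aggregate keeps only the ID of the other aggregate (soft reference).
public sealed record Issue(int Id, string Name, int AssigneeId);

// Hypothetical repository; the dictionary stands in for the real data store.
public sealed class UserRepository
{
    private readonly Dictionary<int, User> _store = new()
    {
        [7] = new User(7, "Alice")
    };

    public User GetById(int id) => _store[id];
}
```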

Business logic inside an aggregate

Well, we have the same rules here as for entities. Only put business logic here which directly belongs to the current state.

But please, avoid putting everything inside your aggregates. Just to get an idea:

public class Issue
{
    // some properties

    public void MoveToBacklog() { }
    public void MoveToReview() { }
    public void Solved() { }
    public void Close(IssueCloseReason reason) { }
    public void ReOpen() { }
    public void IncreasePriority() { }
    public void DecreasePriority() { }
    public void LogTime(TimeValue time) { }
    public void AssignSubTask(IssueId otherIssue) { }
    public void AttachDocument(Attachment attachment) { }
    public void WriteComment(Comment comment) { }
    public void AssignToVersion(ProjectVersion version) { }
    public void AddLabel(IssueLabel label) { }
    // where do you stop?
}

But why is this bad? Well, some of them are OK, as these methods only modify the properties and keep the aggregate’s integrity.

But how many responsibilities do you have here? The first 5 methods actually do the same thing. The next two methods modify the priority, which is also the same thing.

And, this is actually not an extreme. I have seen aggregates with more than 5000 lines of code.

As a rule of thumb, do not have more than 10 methods inside a class. This is, of course, also valid for aggregates.

To solve this, consider writing some logic class like “IssueStatusService” which is responsible for every status change and moving the issue on your board. You can do the same for any other properties as well.
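A minimal sketch of such a service could look like this. The IssueStatus enum and the rule about closed issues are assumptions, just to show where this kind of logic would live.

```csharp
using System;

var issue = new Issue();
var statusService = new IssueStatusService();

statusService.MoveToReview(issue);
Console.WriteLine(issue.Status); // Review

statusService.Close(issue);
Console.WriteLine(issue.Status); // Closed

public enum IssueStatus { Backlog, Review, Solved, Closed }

public sealed class Issue
{
    // The aggregate only stores its state ...
    public IssueStatus Status { get; set; } = IssueStatus.Backlog;
}

// ... while one service owns every status transition (hypothetical sketch).
public sealed class IssueStatusService
{
    public void MoveToReview(Issue issue) => Change(issue, IssueStatus.Review);
    public void Close(Issue issue) => Change(issue, IssueStatus.Closed);
    public void ReOpen(Issue issue) => Change(issue, IssueStatus.Backlog);

    private static void Change(Issue issue, IssueStatus next)
    {
        // Assumed rule for illustration: a closed issue must be reopened first.
        if (issue.Status == IssueStatus.Closed && next != IssueStatus.Backlog)
        {
            throw new InvalidOperationException("Reopen the issue before changing its status.");
        }

        issue.Status = next;
    }
}
```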

Just keep the integrity in mind and you are safe 🙂

If you follow this rule, you will also follow the “single responsibility principle” as well as the “open/closed principle”.

What is the exact difference between an entity and an aggregate?

To be honest, it is not easy to decide whether something is an entity or an aggregate.

If you are struggling at this point, just ask yourself: What will happen to this “so far” aggregate if I delete the parent aggregate? Do not ask yourself: Do I need any values of this object somewhere?

When you delete (or archive) an aggregate, it needs to be deleted without any leftovers. Everything that is assigned to it must be removed from the data store.

Let’s go through the Issue aggregate:

User, Issue (for subtasks), ProjectVersion are other aggregates because they can be loaded independently of the issue, and it would be terrible if we delete them as well by deleting the root issue.

Priority could be an aggregate or just a simple value object. If there is a requirement to control the priority within the admin control panel, then yes, this is an aggregate. But if we can hard-code them, a simple value object is enough.

Comment and Attachment are entities because they can (or must) be deleted once the issue has been removed.

Never do something like this: “I need the username while showing the issue, so let’s make the user an entity”. Trust me, there is absolutely no benefit in doing this.

UI Models

Any application without any kind of display data is useless. Each framework has its own definition of this kind of data classes.

The last major data class type is the UI model.

This kind of data class highly depends on the UI pattern you are using. If you are using WPF and MVVM, you need a change notification (something like INotifyPropertyChanged) for every single property. MVC, on the other hand, does not require this event.

Of course, there are many other languages as well as UI patterns.
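For the WPF/MVVM case, a property with change notification usually looks similar to the following sketch. The IssueViewModel class is hypothetical; INotifyPropertyChanged and CallerMemberName are part of .NET.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.CompilerServices;

var model = new IssueViewModel();
model.PropertyChanged += (_, e) => Console.WriteLine($"{e.PropertyName} changed");
model.Name = "Fix login"; // prints "Name changed"

// Hypothetical MVVM-style model; WPF bindings listen to the PropertyChanged event.
public sealed class IssueViewModel : INotifyPropertyChanged
{
    private string _name = string.Empty;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Name
    {
        get => _name;
        set
        {
            if (_name == value) return; // no notification if nothing changed
            _name = value;
            OnPropertyChanged();
        }
    }

    // CallerMemberName fills in the name of the calling property automatically.
    private void OnPropertyChanged([CallerMemberName] string propertyName = null)
        => PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
}
```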

UI-specific properties

When it comes to the UI, some properties are really only valid within the UI. Sometimes you already have these values (e.g. issue name) and sometimes not (e.g. estimated time of all issues within a specific sprint).

In this case, you can calculate these values and display them on the UI. There is no need to store this kind of data within your data store.

A further example is any kind of color. Issues with high priority should be displayed in red, normal ones in black and low-priority ones in gray. The issue model can look like the following:

class IssueModel
{
    public string Name { get; set; }
    public string Priority { get; set; }
    public string BackgroundColor { get; set; }
    public string Description { get; set; }
}

No value objects on UI

Maybe you have already noticed: My UI data classes have only simple data types like string, int, double, etc. There are no value objects, no entities and no aggregates.

Why? Because each of them is a part of the domain layer, not the UI layer.
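A small mapper can prepare the display values, including the color. In this sketch the IssuePriority enum, the mapper and the color names are assumptions, and the model is shortened to three properties.

```csharp
using System;

var model = IssueModelMapper.ToModel("Fix login", IssuePriority.High);
Console.WriteLine($"{model.Name}: {model.Priority} ({model.BackgroundColor})");
// prints "Fix login: High (Red)"

public enum IssuePriority { Low, Normal, High }

public class IssueModel
{
    public string Name { get; set; }
    public string Priority { get; set; }
    public string BackgroundColor { get; set; }
}

// Hypothetical mapper: turns domain values into display-ready primitives.
public static class IssueModelMapper
{
    public static IssueModel ToModel(string name, IssuePriority priority) => new IssueModel
    {
        Name = name,
        Priority = priority.ToString(),
        BackgroundColor = priority switch
        {
            IssuePriority.High => "Red",
            IssuePriority.Low => "Gray",
            _ => "Black"
        }
    };
}
```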

Designing a domain

How to use these data classes, and how should we use them?

Okay, just to sum up what we have learned so far and what the different data classes are for.

Data transfer objects: (data layer)

Contain only primitive types, which can also be stored within any type of data store. They do not have dependencies. If so, use only soft references for IDs.

Value objects: (domain logic layer)

Are immutable and cannot be changed. They represent one specific value of the domain but do not depend on each other. Sometimes it makes sense to combine two or three values into one single object.

Entities: (domain logic layer)

Every value can be changed. Different entities can have the same values. Child entities are directly accessible. Child aggregates are referenced by their ID. Entities always depend on other aggregates.

Aggregates: (domain logic layer)

Every aggregate differs from every other. It holds all information that depends on it, except other aggregates; for those, it holds an ID. Deleting an aggregate also means deleting all of its entities and value objects.

UI Models: (ui layer)

Only hold display data, which is perfectly prepared for displaying. If needed, including any units (e.g. 30m). Can hold data from multiple aggregates.

Converting the data classes

Having so many similar data classes within different layers has one major disadvantage.

There are three similar data classes within this simple application. IssueDto, Issue (aggregate) and IssueModel. I totally agree that it comes somewhere to the point, where you need to write something like:

IssueDto dto = _repository.GetById(42);

var issue = new Issue();
issue.Id = new IssueId(dto.Id);
issue.Name = new IssueName(dto.Name);
issue.Priority = new IssuePriority(dto.Priority); 
// ...

return issue;

C# users can use implicit/explicit operator overloading and write something like this:

IssueDto dto = _repository.GetById(42);

var issue = new Issue();
issue.Id = dto.Id;
issue.Name = dto.Name;
issue.Priority = dto.Priority; 
// ...

return issue;

If you do this, you keep your domain spotless. It might look like a good idea to put this conversion into the data class itself. However, it is not: you would establish a dependency between the layers. The UI would immediately depend on your data layer, which is not a good idea at all.
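One way to keep the conversion out of the data classes is a separate mapper. The following is only a sketch; IssueMapper is a hypothetical name, and the records are simplified stand-ins for the value objects.

```csharp
using System;

var dto = new IssueDto { Id = 42, Name = "Fix login" };
Issue issue = IssueMapper.ToDomain(dto);
Console.WriteLine($"{issue.Id.Value}: {issue.Name.Value}"); // 42: Fix login

// Data layer
public sealed class IssueDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Domain layer (records used as simplified value objects)
public sealed record IssueId(int Value);
public sealed record IssueName(string Value);

public sealed class Issue
{
    public IssueId Id { get; set; }
    public IssueName Name { get; set; }
}

// Hypothetical mapper that sits between the layers, so neither
// IssueDto nor Issue has to know about the other.
public static class IssueMapper
{
    public static Issue ToDomain(IssueDto dto) => new Issue
    {
        Id = new IssueId(dto.Id),
        Name = new IssueName(dto.Name)
    };

    public static IssueDto ToDto(Issue issue) => new IssueDto
    {
        Id = issue.Id.Value,
        Name = issue.Name.Value
    };
}
```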

Anemic Domain Model vs. Rich Domain Model

Firstly, I want to describe these terms.

Anemic Domain Model means, that the data class has only properties without any further business logic.

Rich Domain Model means you have dozens of methods within your data class. Every property is only accessible via a separate method. Everything that belongs to that model is also included within the model. By using rich domain models, your service layer would be very thin.

Martin Fowler has called the anemic domain model an anti-pattern.

The anemic domain model is really just a procedural style design […] many people think that anemic objects are real objects, and thus completely miss the point of what object-oriented design is all about.

Martin Fowler

Okay, either I miss the point, or Martin is completely wrong at this point!

So, if I understand him correctly, he prefers doing stuff like this:

public class Issue
{
    public IssueId Id { get; set; }
    public IssueName Name { get; set; }

    // many other properties here

    public void SaveToDatabase()
    {
        // do some database interactions
    }

    public List<Issue> GetAllSubTasks()
    {
        // get all subtasks from the database
    }

    public TimeValue GetRemainingTime()
    {
        // do some business logic here
    }
}

In this case, Martin (and also his mate Eric Evans) violates nearly every SOLID principle:

[SRP] There should never be more than one reason for a class to change.

Robert C. Martin

Such classes do everything. Hmm, it leads me just to another anti-pattern: God-Object.

[OCP] Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.

Robert C. Martin

It will also end up with many methods, which really nobody is able to understand. A further anti-pattern: Spaghetti-Code.

[DIP] High-level modules should not depend on low-level modules. Both should depend on abstractions.

Robert C. Martin

Okay, no anti-pattern here (help!) but this kind of code is untestable! Those classes aren’t reusable, so it makes it harder to use them in other projects. But if you stop creating rich domain models, your domain model will become much more flexible.

This unknown author wrote an excellent article, which explains why rich domain models are terrible.

Logic within data classes

Wait, didn’t you say rich domain models are terrible?

Yep 🙂

As a rule of thumb, you should put as few methods into data classes as possible. The best case would be zero.

You can always add methods which only depend on the current object. Never put any (logic class) interfaces inside your data classes.

public class Issue
{
    // some properties

    // terrible
    public void SaveToDatabase() { }
    public List<Issue> GetAllSubTasks() { }
    
    // bad 
    public bool IsValid() { }
    public void Validate() { }

    // ok
    public TimeValue GetRemainingTime() { }
    public void ChangeAssignee(User assignee) { }
    public void ChangeAssignee(UserId assignee) { }
}

I have grouped the methods into three categories:

External dependency to other components (terrible)

Whatever you do, avoid dependencies on any logic classes, especially when it comes to the database. If you do this, you cannot reuse this data class in any other application. You will not be able to write any kind of unit test. You will not be able to exchange the database.

Trust me, sooner or later you want to exchange the data storage. At this point you will look at your rich domain model and just think “Damn, who the f*ck wrote that sh*tty code? Oh, it was me.”

External dependency to itself (bad code)

An object shouldn’t validate itself. In most cases, you need some more information to tell if the current instance is valid. It is also possible to use external frameworks to validate the instance.

This is actually not the responsibility of the data class. Furthermore, a data class can be valid in one situation and invalid in another.

Internal dependency (ok code)

Methods of this kind should only contain dependencies on other data classes. Usually, they would end up within very thin components, static methods, extension methods or something else. You can put them directly into the data class instead.