This post is part of my Dev rules of thumb series. In each post, I leave you with a poster listing a few rules I follow for a concept, which you can print out and scatter around your office, and a brief explanation of what that pattern is about. You can also find a PDF of the poster at the bottom of this post, for better quality printing.
Data Transfer Objects (DTOs) are objects used only to pass some data around as a whole, as opposed to several individual variables.
This makes the client code simpler, as it only needs to deal with one conceptual unit of data, and methods receiving the data will have fewer, better-typed parameters.
Since the goal is simply to move data around, they don’t really need any business logic, only the minimal logic necessary for client code to create such DTOs with an initial set of data, and to be able to later retrieve their data.
So, a DTO is created with a set of data and is passed around. Would it make sense for the emitter of the DTO to send a set of data and for the consumers to receive a different set of data?! No. It would completely defeat the purpose of a DTO. So a DTO should never be changed after it is created: it must be immutable.
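As a minimal sketch of that immutability, assuming Python and a hypothetical `UserDto` (the class and its fields are invented for illustration), a frozen dataclass rejects any assignment after construction. If a consumer needs different data, it builds a new DTO instead of changing the one it received:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class UserDto:
    """Hypothetical DTO: it only carries data and cannot change once created."""
    user_id: int
    name: str
    email: str


dto = UserDto(user_id=42, name="Ada", email="ada@example.com")

# Any attempt to modify the DTO after creation is rejected.
try:
    dto.name = "Grace"
except FrozenInstanceError:
    mutation_rejected = True
else:
    mutation_rejected = False

# A different set of data means a different DTO; the original stays untouched.
renamed = replace(dto, name="Grace")
```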
I usually prefer to have no validation within the DTOs, and leave that to the object receiving the DTO, unless the language is loosely typed, in which case I will validate the types. Usually, as long as the property types are correct, the DTO is in a valid state. However, a DTO might be in a valid state for one receiver, but in an invalid state for another. It’s the receiver that must validate whether it can perform its logic with that set of data, or not.
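Here is a rough sketch of that split, assuming Python as the loosely typed language. `OrderDto`, `send_receipt` and `record_revenue` are hypothetical names made up for this example. The DTO only checks types, while each receiver decides whether the data is good enough for its own logic:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderDto:
    """Hypothetical DTO: it validates types only, never business rules."""
    order_id: int
    customer_email: str
    total_cents: int

    def __post_init__(self) -> None:
        # Python does not enforce annotations at runtime, so check the types here.
        expected_types = (
            ("order_id", int),
            ("customer_email", str),
            ("total_cents", int),
        )
        for field_name, expected in expected_types:
            value = getattr(self, field_name)
            if not isinstance(value, expected):
                raise TypeError(f"{field_name} must be of type {expected.__name__}")


def send_receipt(order: OrderDto) -> str:
    # This receiver cannot work without something that looks like an email.
    if "@" not in order.customer_email:
        raise ValueError("cannot send a receipt without an email address")
    return f"Receipt for order {order.order_id} sent to {order.customer_email}"


def record_revenue(order: OrderDto) -> int:
    # This receiver ignores the email; it only needs a non-negative total.
    if order.total_cents < 0:
        raise ValueError("revenue cannot be negative")
    return order.total_cents
```

An `OrderDto` with an empty `customer_email` is perfectly usable by `record_revenue`, yet `send_receipt` refuses it. The same DTO is valid for one receiver and invalid for the other.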
Despite not being used with persistence in mind, they do need to be serializable so that, if needed, they can be put in a queue and sent to other applications. For this reason, they cannot contain complex data structures, namely circular references, objects that might be connected to several other objects in a huge data graph that could yield a huge payload, or entities which could, on deserialization, be considered by an ORM as a new entity instead of a new version of an entity the ORM already has in cache.
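One way to keep a DTO serializable, sketched here in Python with invented `Author` and `Comment` entities, is to copy only scalar values into the DTO and to refer to related entities by ID. The entities below reference each other in a circle, but the DTO built from them is flat and serializes to a small payload:

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Author:
    """Hypothetical domain entity that holds references to its comments."""
    author_id: int
    name: str
    comments: List["Comment"] = field(default_factory=list)


@dataclass
class Comment:
    """Hypothetical domain entity that points back to its author (a cycle)."""
    comment_id: int
    author: Author
    body: str


@dataclass(frozen=True)
class CommentDto:
    """Flat DTO: scalars only, with the author referenced by ID."""
    comment_id: int
    author_id: int
    body: str


def to_dto(comment: Comment) -> CommentDto:
    # Copy only what consumers need; never embed the entity itself.
    return CommentDto(
        comment_id=comment.comment_id,
        author_id=comment.author.author_id,
        body=comment.body,
    )


author = Author(author_id=1, name="Ada")
comment = Comment(comment_id=10, author=author, body="Nice post")
author.comments.append(comment)  # author -> comment -> author

payload = json.dumps(asdict(to_dto(comment)))
```

Because the DTO carries `author_id` rather than the `Author` itself, there is no object graph to traverse and nothing an ORM on the receiving side could mistake for a new entity.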
Martin Fowler, who wrote about the DTO pattern in his book “Patterns of Enterprise Application Architecture”, has mentioned that “their whole purpose is to shift data in expensive remote calls” in an article from 2004. However, in the same article he also mentions “One case where it is useful to use something like a DTO is when you have a significant mismatch between the model in your presentation layer and the underlying domain model.” which is not necessarily a “remote” scenario. Furthermore, at the end of the same article, he gives yet another use case for DTOs: “communicating between isolates in multi-threaded applications” (again, not a remote scenario). I believe that what Martin Fowler intends to warn against is the overuse of DTOs: always using DTOs to move data through every layer of the application brings unnecessary boilerplate and blurs the expressiveness of the domain representation.
Personally, I mostly think of a DTO when I have a query object that returns some data that a template needs. It’s a purely presentational need, there is no domain need involved, and it is local (it fits the 2nd scenario Martin Fowler refers to).
Nowadays, I often use Envelopes, Commands and Events, which I see as specializations of DTOs. They are DTOs with a specific role (maybe remote, maybe not). So I only refer to them as DTOs when explaining to other developers what they are.
Back in 2004, we were still very much tied to a layered architecture style, where objects could know about persistence and serialization mechanisms, so the pattern would include in a DTO a method to serialize such objects. This is still valid in such an architecture style. However, nowadays we also use concentric layered architecture styles like Clean Architecture. In such a scenario, the DTO is unaware of any serialization mechanism and thus does not need a serialization method.
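To illustrate the second style, here is a small sketch in Python. It assumes a hypothetical `JsonDtoSerializer` living in an outer (infrastructure) layer; the class name, its methods and the envelope format are my own invention, not part of the pattern. The DTO has no `to_json()` method and knows nothing about JSON:

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class UserRegisteredDto:
    """Hypothetical DTO: pure data, unaware of any serialization mechanism."""
    user_id: int
    email: str


class JsonDtoSerializer:
    """Hypothetical outer-layer component that knows about JSON so DTOs do not."""

    def serialize(self, dto) -> str:
        # Wrap the data with the DTO type name so a consumer can pick the class.
        return json.dumps({"type": type(dto).__name__, "data": asdict(dto)})

    def deserialize(self, payload: str, dto_class):
        # Rebuild the DTO from the "data" part of the payload.
        return dto_class(**json.loads(payload)["data"])


serializer = JsonDtoSerializer()
original = UserRegisteredDto(user_id=7, email="ada@example.com")
payload = serializer.serialize(original)
restored = serializer.deserialize(payload, UserRegisteredDto)
```

Swapping JSON for another format would then mean writing another serializer, without touching any DTO.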
Here’s a simple example of a DTO: