- Test your system with chaos strategies: deliberately inject faults and check that it degrades the way you expect.
- Have a plan to at least partially fulfill your service promise when a fault occurs, whether by returning a canned message or by calling a different service as a backup.
- Be conservative in what you send to a service and liberal in what you accept. Your unmarshalling logic should validate responses just enough to pull out the data you need, rather than performing full validation.
- When services fail and messages are retried, repeated deliveries of the same message should not introduce inconsistencies into your users’ systems.
- Use infrastructure that filters out duplicates, or track the unique identifiers in messages and discard any that have already been processed successfully.
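
Three of the points above (tolerant unmarshalling, fallbacks, and duplicate filtering) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original post's implementation: the names `read_price`, `with_fallback`, and `IdempotentConsumer`, the `price` and `id` fields, and the in-memory set of processed identifiers are all hypothetical. A real system would likely keep processed identifiers in durable storage.

```python
import json
from typing import Callable, Optional


def read_price(raw: str) -> Optional[float]:
    """Tolerant reader: extract only the field we need and ignore the rest.

    The "price" field is a hypothetical example; unknown fields are ignored.
    """
    try:
        payload = json.loads(raw)
        return float(payload["price"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing field, or unusable value: report "no data".
        return None


def with_fallback(primary: Callable[[], str],
                  backup: Callable[[], str],
                  canned: str) -> str:
    """Try the primary service, then a backup, then return a canned message."""
    for call in (primary, backup):
        try:
            return call()
        except Exception:
            continue  # this source failed; try the next one
    return canned


class IdempotentConsumer:
    """Discard messages whose unique id was already processed successfully."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        self._processed = set()  # assumption: in-memory store, for illustration

    def consume(self, message: dict) -> bool:
        msg_id = message["id"]  # hypothetical unique identifier field
        if msg_id in self._processed:
            return False  # duplicate delivery: skip it
        self._handler(message)
        # Record the id only after the handler succeeds, so a failed
        # attempt can still be retried later.
        self._processed.add(msg_id)
        return True
```

Recording the identifier only after the handler returns means a message whose processing raised an exception is not marked as done, which matches the "already processed successfully" condition in the last bullet.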
Full post here, 7 mins read