• Test your system using chaos strategies.
  • Have a plan to fulfill your service promise at least partially when a fault occurs, whether by returning a canned message or by calling a different service as a backup.
  • Be conservative in what you send to a service and liberal in what you accept. Your unmarshalling logic should validate responses only as far as necessary and extract only the data you need, rather than performing a full validation.
  • When services fail, the same message may be delivered more than once. Processing it repeatedly should not introduce inconsistencies into your users’ systems.
  • Use infrastructure that filters out duplicates, or track unique identifiers in the messages and discard those that have already been processed successfully.
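
The last two points can be sketched as a small idempotent consumer. This is a minimal illustration, not the approach from the original post: the class name `IdempotentConsumer`, the `id`, `account` and `amount` fields, and the in-memory set are all assumptions made for the example. A real system would keep the processed IDs in durable storage.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed on a unique message ID."""

    def __init__(self, handler):
        self._handler = handler
        # Assumption: an in-memory set stands in for durable storage.
        self._processed_ids = set()

    def consume(self, message):
        """Return True if the message was applied, False if it was a duplicate."""
        msg_id = message["id"]
        if msg_id in self._processed_ids:
            return False  # already processed successfully, so discard it
        # If the handler raises, the ID is not recorded and a retry is accepted.
        self._handler(message)
        self._processed_ids.add(msg_id)
        return True


# Hypothetical usage: a credit that must not be applied twice.
balances = {"alice": 0}


def credit(message):
    balances[message["account"]] += message["amount"]


consumer = IdempotentConsumer(credit)
msg = {"id": "m-1", "account": "alice", "amount": 50}
consumer.consume(msg)
consumer.consume(msg)  # redelivery of the same message is ignored
```

Recording the ID only after the handler succeeds means a failed attempt can be retried, while a successful one is never applied twice.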

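The fallback and "liberal in what you accept" points can be combined in one short sketch. Everything named here is hypothetical: the `items` and `title` fields, the `CANNED_REPLY` payload, and the `fetch` callable that stands in for a call to a remote service. The parser reads only the field it needs and ignores the rest, and any fault leads to a canned reply instead of an error.

```python
import json

# Assumption: a canned reply that partially fulfills the service promise.
CANNED_REPLY = {
    "recommendations": [],
    "note": "Recommendations are temporarily unavailable.",
}


def parse_recommendations(raw_body):
    """Tolerant reader: extract only the titles and ignore unknown fields."""
    payload = json.loads(raw_body)
    return [str(item["title"]) for item in payload["items"] if "title" in item]


def get_recommendations(fetch):
    """Call the service via `fetch`; fall back to the canned reply on a fault."""
    try:
        titles = parse_recommendations(fetch())
    except (ConnectionError, TimeoutError, ValueError, KeyError, TypeError):
        # Network faults and unusable responses are treated the same way.
        return dict(CANNED_REPLY)
    return {"recommendations": titles, "note": ""}


# Hypothetical stand-ins for a healthy and a failing downstream service.
def healthy():
    return '{"items": [{"title": "A", "extra": 1}, {"id": 2}], "version": 9}'


def broken():
    raise ConnectionError("service down")
```

Calling `get_recommendations(healthy)` yields the single title and ignores the extra fields, while `get_recommendations(broken)` yields the canned reply. A backup service could be tried in the `except` branch before falling back to the canned message.
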
Full post here, 7 mins read