In this context, I will consider the issue of data representation: overly coarse data aggregation leads to overly simplistic models, whereas integrating highly detailed data sources yields models that are less transparent and less general. I will consider several coarse-grained representations of the contact patterns occurring in a hospital ward. Simulations of disease-spreading models in this community show that the usual contact matrix representation, which contains only the average contact durations between role classes, fails both to reproduce the size of the epidemic obtained using the high-resolution contact data and to identify the most at-risk classes. I will introduce a contact matrix of probability distributions that takes into account the heterogeneity of contact durations between (and within) classes of individuals, and show that it yields a good approximation of the epidemic spreading properties obtained using the high-resolution data.