Workflow management systems (WFMSs) that are geared toward the orchestration of enterprise-wide or even “virtual-enterprise”-style
business processes across multiple organizations are complex distributed systems. They consist of multiple workflow engines,
application servers, and ORB-style communication servers. Thus, deriving a suitable configuration of an entire distributed
WFMS for a given application workload is a difficult task.
This paper presents a mathematically based method for configuring a distributed WFMS such that the application’s demands regarding
performance and availability are met while the total system cost is kept as low as possible. The major degree of freedom that
the configuration method considers is the replication of the underlying software components (workflow engines and application
servers of different types, as well as the communication server) on multiple computers for load partitioning and enhanced availability.
The mathematical core of the method consists of Markov-chain models, derived from the application’s workflow specifications,
that make it possible to assess, for given degrees of replication, the overall system’s performance, its availability, and its
performability in degraded mode, when some server replicas are offline. By iterating over the space of feasible system configurations
and assessing the quality of each candidate configuration, the developed method determines a configuration with near-minimum cost.
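
The configuration search described above can be illustrated with a deliberately simplified sketch. It is not the paper's model: it assumes that each replica fails and is repaired independently (a two-state up/down Markov chain), that load is split evenly across the replicas of a server type, that each replica behaves like an M/M/1 queue, and that the configuration space is small enough to enumerate exhaustively. Performability in degraded mode is not modeled. All names (`ServerType`, `cheapest_configuration`, and so on) and all parameters are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from math import prod


@dataclass(frozen=True)
class ServerType:
    """Hypothetical description of one replicable server type."""
    name: str
    failure_rate: float   # failures per unit time (assumed exponential)
    repair_rate: float    # repairs per unit time (assumed exponential)
    service_rate: float   # requests per second one replica can serve
    arrival_rate: float   # requests per second directed at this type
    cost: float           # cost of one replica


def replica_availability(s: ServerType) -> float:
    # Steady-state probability of the "up" state in a two-state Markov chain.
    return s.repair_rate / (s.failure_rate + s.repair_rate)


def type_availability(s: ServerType, k: int) -> float:
    # The type is available if at least one of its k independent replicas is up.
    return 1.0 - (1.0 - replica_availability(s)) ** k


def mean_response_time(s: ServerType, k: int) -> float:
    # M/M/1 mean response time per replica, with load split evenly over k replicas.
    per_replica_load = s.arrival_rate / k
    if per_replica_load >= s.service_rate:
        return float("inf")  # replica would be overloaded
    return 1.0 / (s.service_rate - per_replica_load)


def cheapest_configuration(types, min_availability, max_response_time,
                           max_replicas=4):
    """Enumerate replica counts and return (cost, counts) of the cheapest
    configuration that meets both goals, or None if none does."""
    best = None
    for counts in product(range(1, max_replicas + 1), repeat=len(types)):
        pairs = list(zip(types, counts))
        # All server types must be available for the system to be available.
        availability = prod(type_availability(s, k) for s, k in pairs)
        if availability < min_availability:
            continue
        if any(mean_response_time(s, k) > max_response_time for s, k in pairs):
            continue
        cost = sum(s.cost * k for s, k in pairs)
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best
```

Calling `cheapest_configuration` with a list of `ServerType` instances, an availability goal, and a response-time goal returns the replica count per type. Exhaustive enumeration is used here only for brevity; it grows exponentially with the number of server types, which is one reason a practical method might settle for a near-minimum rather than a provably minimum cost.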
This work was performed within the research project “Architecture, Configuration, and Administration of Large Workflow Management
Systems” funded by the German Science Foundation (DFG).