Omar Hosney

📊 bupaR Cheat Sheet

Business Process Analytics in R • bupaverse ecosystem

📋 Event Log Components

REQUIRED case_id – Unique identifier of the process instance (case)
REQUIRED activity_id – Activity name/label
REQUIRED activity_instance_id – Unique ID of the activity instance (event)
REQUIRED timestamp – When the event occurred
OPTIONAL lifecycle_id – Lifecycle status (e.g. start/complete)
OPTIONAL resource_id – Who performed the activity
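
A minimal sketch of building such a log with eventlog(); the data frame and its column names (patient, treatment, event_id, ...) are invented for illustration.

library(bupaR)

# hypothetical event data: one row per event
events <- data.frame(
  patient   = c("p1", "p1", "p2"),               # case_id
  treatment = c("Register", "Exam", "Register"), # activity_id
  event_id  = 1:3,                               # activity_instance_id
  status    = "complete",                        # lifecycle_id
  ts        = as.POSIXct(c("2024-01-01 08:00:00",
                           "2024-01-01 09:00:00",
                           "2024-01-01 08:30:00")),  # timestamp
  nurse     = c("Ann", "Bob", "Ann")             # resource_id
)

log <- eventlog(events,
                case_id              = "patient",
                activity_id          = "treatment",
                activity_instance_id = "event_id",
                lifecycle_id         = "status",
                timestamp            = "ts",
                resource_id          = "nurse")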

📖 Performance Metrics Defined

⏱️ Throughput Time

Total time from case start to end. Includes all waiting + working time. Measures end-to-end duration.

βš™οΈ Processing Time

Actual working time spent on activities. Excludes waiting/idle periods. Sum of all activity durations.

💤 Idle Time

Time between activities where no work happens. Waiting time = Throughput − Processing time.

[Timeline diagram: Act 1–Act 4 shown as ■ Processing segments separated by ■ Idle gaps, together spanning ← Throughput Time →]
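
To see the relationship (idle = throughput − processing) in practice, the three metrics can be computed side by side; this sketch uses the patients demo log from eventdataR.

library(bupaverse)   # loads bupaR, edeaR, eventdataR, processmapR, ...

patients %>% throughput_time(level = "case", units = "days")  # end-to-end duration per case
patients %>% processing_time(level = "case", units = "days")  # active work per case
patients %>% idle_time(level = "case", units = "days")        # waiting time per case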

📦 Setup

library(bupaR)        # core package
library(bupaverse)    # loads all bupaverse packages
library(processmapR)  # process map visualizations

ℹ️ Log Info

summary(log)      # overview
mapping(log)      # column mapping
case_id(log)      # case id column
activity_id(log)  # activity column

🔧 Creating Event Log from DataFrames

orders.csv (order_id, date, customer, status) + actions.csv (action, timestamp) → mutate() + merge → eventlog(case_id = "order_id", activity_id = "action", timestamp = "date", ...)
df %>% mutate(case_id = paste(id1, id2, sep="-")) %>%
  eventlog(case_id="case_id", activity_id="act", timestamp="ts", ...)
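
A fuller sketch of that merge workflow, assuming the two files and columns exist as described above (orders.csv with order_id, date, customer, status; actions.csv with order_id, action, timestamp):

library(bupaR)
library(dplyr)

orders  <- read.csv("orders.csv")    # order_id, date, customer, status
actions <- read.csv("actions.csv")   # order_id, action, timestamp

log <- orders %>%
  inner_join(actions, by = "order_id") %>%
  mutate(
    timestamp   = as.POSIXct(timestamp),   # per-action timestamp
    instance_id = row_number(),            # one activity instance per row
    lifecycle   = "complete",
    resource    = NA_character_
  ) %>%
  eventlog(
    case_id              = "order_id",
    activity_id          = "action",
    activity_instance_id = "instance_id",
    lifecycle_id         = "lifecycle",
    timestamp            = "timestamp",
    resource_id          = "resource"
  )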

🔢 Counts

n_cases(log)       # number of cases
n_activities(log)  # number of activities
n_traces(log)      # number of traces
n_resources(log)   # number of resources

🏷️ Labels

activity_labels(log)  # activity labels
resource_labels(log)  # resource labels
activities(log)       # activities data frame
resources(log)        # resources data frame

🔀 Traces

traces(log)                           # all traces
trace_length(log)                     # trace lengths
trace_coverage(log)                   # trace coverage
trace_explorer(log, coverage = 0.8)   # visualize most frequent traces
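
For example, plotting the traces that together cover 80% of the cases in the eventdataR::patients demo log:

library(bupaverse)

patients %>% trace_explorer(coverage = 0.8)   # most frequent traces up to 80% case coverage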

πŸ—ΊοΈ Process Maps

process_map(log)basic
process_map(type=frequency())freq
process_map(type=performance())perf
resource_map(log)resources
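
A quick sketch on the patients demo log; the choice of mean durations in hours for performance() is illustrative.

library(bupaverse)

patients %>% process_map()                                    # absolute frequencies on nodes/edges
patients %>% process_map(type = performance(mean, "hours"))   # mean durations instead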

⏱️ Performance

throughput_time(log)   # total time
processing_time(log)   # work time
idle_time(log)         # wait time
units = "days" / "hours" / "mins"

👥 Resources

resource_frequency(log)    # frequencies
resource_involvement(log)  # cases involved in
level = "resource" / "activity" / "resource-activity"
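
For example, on the patients demo log:

library(bupaverse)

patients %>%
  resource_frequency(level = "resource-activity") %>%
  plot()   # which resource performs which activity, and how often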

πŸ“ Structure

start_activities(log)starts
end_activities(log)ends
activity_presence(log)presence
number_of_repetitions(log)rework

↔️ Precedence

precedence_matrix(log)                    # precedence matrix
type = "absolute" / "relative"
process_matrix(log, type = frequency())   # process matrix
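
A sketch on the patients demo log, plotted as a matrix:

library(bupaverse)

patients %>%
  process_matrix(type = frequency("relative")) %>%
  plot()   # directly-follows relations as a heat map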

⏰ Filter: Throughput Time

filter_throughput_time(log, interval = c(5, 10), units = "days")   # between 5 and 10 days
filter_throughput_time(log, percentage = 0.5)                      # fastest 50%
filter_throughput_time(log, percentage = 0.5, reverse = TRUE)      # slowest 50%
filter_throughput_time(log, interval = c(5, NA), units = "days")   # longer than 5 days
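
For instance, keeping only the slowest half of the cases before mapping them (a sketch on the patients demo log):

library(bupaverse)

patients %>%
  filter_throughput_time(percentage = 0.5, reverse = TRUE) %>%   # slowest 50% of cases
  process_map(type = performance(mean, "days"))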

🎯 Filter: Activity

filter_activity(log, activities = c("A", "B"))           # keep these activity labels
filter_activity_frequency(log, percentage = 0.6)         # most frequent activities (60% coverage)
filter_activity_presence(log, c("A"), method = "none")   # cases where "A" is absent
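
E.g. keeping only cases that never contain a "Blood test" activity (label assumed from the patients demo log):

library(bupaverse)

patients %>%
  filter_activity_presence("Blood test", method = "none") %>%   # cases without this activity
  process_map()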

🔗 Filter: Precedence

filter_precedence(log, antecedents = "A", consequents = "B")   # keep cases with this flow
precedence_type = "eventually_follows" / "directly_follows"
filter_method = "all" / "one_of" / "none"
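
A sketch on the patients demo log; the activity labels are assumed from that dataset:

library(bupaverse)

patients %>%
  filter_precedence(antecedents     = "Registration",
                    consequents     = "Check-out",
                    precedence_type = "eventually_follows",
                    filter_method   = "all") %>%
  n_cases()   # cases where Registration is eventually followed by Check-out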

πŸ” Filter: Trace

filter_trace_frequency(log, percentage=0.2)top 20%
filter_resource_frequency(log, interval=c(2,5))res range
filter_time_period(log, interval, filter_method="trim")time

📦 Aggregate

act_unite(log, "New" = c("A", "B"))     # relabel A and B as "New"
act_collapse(log, "New" = c("A", "B"))  # collapse A and B into one activity "New"
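
A sketch with hypothetical activity labels, assuming an event log object named log that contains activities "Check A" and "Check B":

library(bupaR)

log %>%
  act_unite("Check" = c("Check A", "Check B")) %>%   # both labels become "Check"
  activity_labels()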

✨ Enrich

group_by_case(log)                                    # group by case
mutate(x = sum(cost))                                 # add a derived variable
throughput_time(log, level = "case", append = TRUE)   # append metric as a column
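
A sketch on the patients demo log; the cost column is invented here purely to illustrate enrichment:

library(bupaverse)
library(dplyr)

patients %>%
  mutate(cost = runif(n(), 10, 100)) %>%           # hypothetical cost per event
  group_by_case() %>%
  mutate(case_cost = sum(cost)) %>%                # total cost per case
  ungroup_eventlog() %>%
  throughput_time(level = "case", append = TRUE)   # append throughput time as a column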

📊 Visualize

plot()   # pipe any metric into plot()
dotted_chart(log, x = "relative")   # case timeline
x = "absolute" / "relative" / "relative_day"
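
For example, a dotted chart of the patients demo log aligned on relative case time:

library(bupaverse)

patients %>% dotted_chart(x = "relative")   # one row per case, aligned at case start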

πŸ“ Granularity Levels

level = "log" | "trace" | "case"
       | "activity" | "resource"
       | "resource-activity"
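
The same metric changes meaning with the level argument; a sketch on the patients demo log:

library(bupaverse)

patients %>% processing_time(level = "log", units = "hours")        # one figure for the whole log
patients %>% processing_time(level = "activity", units = "hours")   # per activity type
patients %>% processing_time(level = "resource", units = "hours")   # per resource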

βœ‚οΈ Slice & Dice

slice(log, n)nth case
filter(col == "val")dplyr
group_by(col)segment