feat: add workflow macro UI support#5179

Open
carloea2 wants to merge 4 commits into
apache:mainfrom
carloea2:macros

@carloea2
Contributor

What changes were proposed in this PR?

image

Implements workflow macros: a way to group canvas operators into a named, collapsible container, seeded by importing another workflow.

How it works

  • The user drags and drops a virtual "Workflow Macro" operator, then picks any of their workflows in the property panel. On selection, the workflow's operators and links are copied (as a snapshot) into the macro with fresh IDs; there is no live link back to the source. The macro keeps the source's id/name only as a display label.
  • Collapsed mode hides the internal operators and replaces them with a single macro node on the canvas; external links that crossed into the macro are proxied through that node so the graph remains connected. Expanded mode shows internals inline within the frame.
  • Macros are pure UI grouping. They round-trip through workflow persistence (WorkflowContent.macros[]) and across collaborative editing, but the execution engine sees the underlying operator graph unchanged — macroIdParent is metadata only.
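
A minimal sketch of the snapshot-with-fresh-IDs step described above. All type names, field names other than `macroIdParent`, and the `freshId` helper are assumptions made for illustration; the PR's actual code is not shown in this thread.

```typescript
// Hypothetical shapes for illustration only.
interface SketchOperator {
  operatorID: string;
  operatorType: string;
  macroIdParent?: string; // metadata only, per the PR description
}

interface SketchLink {
  linkID: string;
  source: string; // operatorID of the upstream operator
  target: string; // operatorID of the downstream operator
}

interface SketchWorkflow {
  operators: SketchOperator[];
  links: SketchLink[];
}

// Stand-in ID generator (assumption): any collision-free scheme would do.
let nextId = 0;
function freshId(prefix: string): string {
  nextId += 1;
  return `${prefix}-copy-${nextId}`;
}

// Copy a source workflow into a macro. Every operator and link gets a new ID,
// links are rewired to the new operator IDs, and each copied operator is
// tagged with the macro it belongs to. The source object is never mutated,
// so no live link back to it exists.
function snapshotIntoMacro(source: SketchWorkflow, macroId: string): SketchWorkflow {
  const idMap = new Map<string, string>();
  const operators: SketchOperator[] = source.operators.map(op => {
    const newId = freshId(op.operatorType);
    idMap.set(op.operatorID, newId);
    return { ...op, operatorID: newId, macroIdParent: macroId };
  });
  const links: SketchLink[] = [];
  for (const link of source.links) {
    const newSource = idMap.get(link.source);
    const newTarget = idMap.get(link.target);
    // Skip links that reference operators missing from the source snapshot.
    if (newSource === undefined || newTarget === undefined) continue;
    links.push({ linkID: freshId("link"), source: newSource, target: newTarget });
  }
  return { operators, links };
}

const sourceWorkflow: SketchWorkflow = {
  operators: [
    { operatorID: "scan-1", operatorType: "Scan" },
    { operatorID: "filter-1", operatorType: "Filter" },
  ],
  links: [{ linkID: "link-1", source: "scan-1", target: "filter-1" }],
};
const macroCopy = snapshotIntoMacro(sourceWorkflow, "macro-1");
```

Because the copy shares no IDs with the source, later edits to either side cannot affect the other, which matches the snapshot semantics stated in the description.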

Scope of changes

  • Backend (common/workflow-operator): adds optional macroIdParent to LogicalOp and hides it from the auto-generated property form.
  • Frontend types: new WorkflowMacro interface; WorkflowContent carries an optional macros[].
  • Frontend graph model: macro framing, collapse/expand visuals, proxy links, macro-aware auto-layout, collaborative sync.
  • Frontend UI: new macro side panel; context menu gains create macro / remove from macro; macro-aware delete/paste.
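
The proxy links mentioned under the graph model can be illustrated with a small sketch. It assumes a simplified model in which links carry only operator IDs (no ports) and parallel proxy links are merged; `MacroSketch`, `OpSketch`, `EdgeSketch`, and `visibleEdges` are hypothetical names, not the PR's API.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface MacroSketch {
  macroId: string;
  collapsed: boolean;
}

interface OpSketch {
  operatorID: string;
  macroIdParent?: string;
}

interface EdgeSketch {
  source: string;
  target: string;
}

// Compute the edges to draw on the canvas. Operators inside a collapsed macro
// are represented by the macro node, so edges crossing the macro boundary are
// redirected to that node and edges fully inside it are hidden.
function visibleEdges(ops: OpSketch[], edges: EdgeSketch[], macros: MacroSketch[]): EdgeSketch[] {
  const macroById = new Map<string, MacroSketch>(
    macros.map((m): [string, MacroSketch] => [m.macroId, m])
  );
  const nodeOf = new Map<string, string>();
  for (const op of ops) {
    const macro = op.macroIdParent ? macroById.get(op.macroIdParent) : undefined;
    nodeOf.set(op.operatorID, macro && macro.collapsed ? macro.macroId : op.operatorID);
  }
  const seen = new Set<string>();
  const result: EdgeSketch[] = [];
  for (const edge of edges) {
    const source = nodeOf.get(edge.source);
    const target = nodeOf.get(edge.target);
    if (source === undefined || target === undefined) continue; // dangling edge
    if (source === target) continue; // internal to a collapsed macro
    const key = `${source}->${target}`;
    if (seen.has(key)) continue; // merge parallel proxy edges (simplification)
    seen.add(key);
    result.push({ source, target });
  }
  return result;
}

const proxyOps: OpSketch[] = [
  { operatorID: "a" },
  { operatorID: "b", macroIdParent: "m1" },
  { operatorID: "c", macroIdParent: "m1" },
  { operatorID: "d" },
];
const proxyEdges: EdgeSketch[] = [
  { source: "a", target: "b" },
  { source: "b", target: "c" },
  { source: "c", target: "d" },
];
const collapsedView = visibleEdges(proxyOps, proxyEdges, [{ macroId: "m1", collapsed: true }]);
const expandedView = visibleEdges(proxyOps, proxyEdges, [{ macroId: "m1", collapsed: false }]);
```

In this sketch the underlying `proxyEdges` array is never modified; collapsing only changes what is drawn, which is consistent with the description's claim that macros are pure UI grouping.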

Any related issues, documentation, discussions?

Closes #5178.

How was this PR tested?

New unit specs covering macro ID helpers, frame geometry, internal auto-layout, macro CRUD + undo/redo bundling, create/remove predicates, macro-aware context menu, and execution payload shape:

  • joint-graph-wrapper.spec.ts (+111)
  • workflow-action.service.spec.ts (+99)
  • operator-menu.service.spec.ts (+38)
  • context-menu.component.spec.ts (+6)
  • execute-workflow.service.spec.ts (+10)

Manual canvas testing:

  1. Drop macro → pick a workflow → its operators copy in with fresh IDs.
  2. Toggle collapse/expand → internals hide/show, external links proxy through the macro node.
  3. Drag macro → children translate together; no spurious position events.
  4. Remove from macro on a child → macroIdParent cleared, frame reflows.
  5. Delete macro

Was this PR authored or co-authored using generative AI tooling?

No

Texera.-.Google.Chrome.2026-05-24.04-22-18.compressed-under-10mb-1.5x.mp4

@github-actions github-actions Bot added the frontend (Changes related to the frontend GUI) and common labels May 24, 2026
@carloea2
Contributor Author

@mengw15 can I get your help to decide a split for this PR?

@carloea2
Contributor Author

Thanks

carloea2 and others added 2 commits May 24, 2026 04:59
Remove the context-menu "create macro" action together with its
supporting service methods (canCreateMacroFromHighlightedOperators,
createMacroFromHighlightedOperators, createMacroForOperators, and the
three private position helpers it pulled in). Empty macros can still
be created via toolbox drag-drop or operator search; "remove from
macro" stays in the context menu. Test rewritten to set up macro
membership via createMacroAt + setOperatorsMacroParent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented May 24, 2026

Codecov Report

❌ Patch coverage is 8.33333% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.71%. Comparing base (0a42fcd) to head (c538fdd).

Files with missing lines                                 Patch %   Lines
.../operator/metadata/OperatorMetadataGenerator.scala    0.00%     11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##               main    #5179   +/-   ##
=========================================
  Coverage     43.71%   43.71%           
- Complexity     2217     2218    +1     
=========================================
  Files          1049     1049           
  Lines         40578    40537   -41     
  Branches       4327     4321    -6     
=========================================
- Hits          17739    17722   -17     
+ Misses        21733    21711   -22     
+ Partials       1106     1104    -2     
Flag                               Coverage Δ                 *Carryforward flag
access-control-service             39.53% <ø> (ø)
agent-service                      33.76% <ø> (ø)             Carriedforward from 655ae45
amber                              44.18% <8.33%> (+0.01%) ⬆️
computing-unit-managing-service    0.00% <ø> (ø)
config-service                     0.00% <ø> (ø)
file-service                       32.18% <ø> (ø)
frontend                           35.15% <ø> (-0.03%) ⬇️     Carriedforward from 655ae45
python                             90.50% <ø> (ø)             Carriedforward from 655ae45
workflow-compiling-service         56.81% <ø> (ø)

*This pull request uses carry forward flags.

@mengw15
Contributor

mengw15 commented May 24, 2026

@mengw15 can I get your help to decide a split for this PR?

Thanks for the PR! I think this is a pretty important feature for Texera, so it may be worth having a broader discussion to align on the design and direction before moving forward.

I also remember that Xiaozhen worked on a related macro feature during the hackathon, for reference: #5115

cc @chenlica @Yicong-Huang @Xiao-zhen-Liu @aglinxinyuan

@aglinxinyuan
Contributor

aglinxinyuan commented May 24, 2026

The direction is correct. We just need to split the backend part into a separate PR. The frontend part will be larger than a regular PR, but I think it is already the minimum unit and cannot be split further. It's manageable and reviewable.

@Yicong-Huang
Contributor

Thanks for the PR! It's great to see multiple designs of macros. We definitely need this feature!

Let's talk about the experience before diving into code.

create a macro

It is not clear to me how to create a macro from scratch.

In @Xiao-zhen-Liu's version (#5115), the user highlights three operators, right-clicks to open the context menu, and clicks "Create Macro"; the three operators then disappear and are replaced with a macro operator. The macro operator has the same shape and size as other normal operators.

I do like this experience.

Representation of a macro and reuse of a macro operator

In this version, the user drags out a macro operator (which is a marker rather than an operator) and selects an option from a drop-down menu inside the property panel.

It is not clear how to reuse a macro in #5115.

I would prefer not to have the macro be a marker. Instead, as we have always discussed, a macro operator should be just the same as other logical operators. For instance, you can view the Hash Join operator (we should rename it to Join) as a macro: it gets expanded into build and probe operators later, after compilation. It would be great if macro operators could look similar to other normal operators (as in #5115). Most importantly, a macro operator should have ports that are derived from the sub-DAG it represents. For instance, if a macro operator contains a sub-DAG that has 2 inputs and 3 output branches, then the macro should have 2 input ports and 3 output ports. This way the user can use the macro operator without knowing its internal shape. During execution, the user can see the macro's aggregated statistics instead of the internal operators' statistics.
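
One possible way to derive a macro's ports from its sub-DAG, sketched under a simplifying assumption: a macro input port is any internal input port not fed by an internal link, and a macro output port is any internal output port not consumed by an internal link. The types and the `deriveMacroPorts` function are hypothetical and belong to neither PR.

```typescript
// Hypothetical shapes for illustration only.
interface PortedOp {
  id: string;
  inputs: number;  // number of declared input ports
  outputs: number; // number of declared output ports
}

interface PortLink {
  fromOp: string;
  fromPort: number;
  toOp: string;
  toPort: number;
}

interface PortRef {
  op: string;
  port: number;
}

// Derive the external ports of a macro from the operators and links inside it.
function deriveMacroPorts(
  ops: PortedOp[],
  internalLinks: PortLink[]
): { inputs: PortRef[]; outputs: PortRef[] } {
  const fed = new Set(internalLinks.map(l => `${l.toOp}:${l.toPort}`));
  const consumed = new Set(internalLinks.map(l => `${l.fromOp}:${l.fromPort}`));
  const inputs: PortRef[] = [];
  const outputs: PortRef[] = [];
  for (const op of ops) {
    for (let p = 0; p < op.inputs; p++) {
      if (!fed.has(`${op.id}:${p}`)) inputs.push({ op: op.id, port: p });
    }
    for (let p = 0; p < op.outputs; p++) {
      if (!consumed.has(`${op.id}:${p}`)) outputs.push({ op: op.id, port: p });
    }
  }
  return { inputs, outputs };
}

// A two-input join feeding a three-way split.
const subDagOps: PortedOp[] = [
  { id: "join", inputs: 2, outputs: 1 },
  { id: "split", inputs: 1, outputs: 3 },
];
const subDagLinks: PortLink[] = [{ fromOp: "join", fromPort: 0, toOp: "split", toPort: 0 }];
const macroPorts = deriveMacroPorts(subDagOps, subDagLinks);
```

For this sub-DAG the sketch yields 2 input ports and 3 output ports, the shape used in the example above. An output port that feeds both an internal and an external consumer would need extra handling that this sketch leaves out.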

Display the macro's internal sub-DAG and edit it

In this version, the macro's content (internal sub-DAG) is displayed in place. It is more like the grouping view Zuozhi implemented before in #754. The user can drag and drop a new operator into an existing macro to edit it. I assume the user can also delete an operator or reconnect an edge between operators.

In Xiaozhen's #5115, the user needs to click on the macro to jump to another workflow view, where it expands into the internal DAG. The internal view has a dummy source operator and a dummy sink operator to help the user understand it. I think this experience also mimics RapidMiner/KNIME. It is not clear whether the user can edit the internal sub-DAG in that view.

I do like the experience of jumping to another workflow view to edit the macro's internal shape, because once a macro is created, it is likely to be reused by many workflows. Editing a macro in place in one particular workflow would also affect the same macro used in other workflows (think of modifying a function's definition at a call site: other invocations of the same function would also be affected). The macro is better edited in a standalone view.

However, I do not like the dummy source and dummy sink in #5115. I think we should prioritize ports; just showing input and output ports is sufficient.

Other ideas

  • I think it may be a good idea to show "My macros" in the operator properties directly and let users drag a macro operator out the same way they drag out a normal operator. This would save the extra step in the property panel. Think about this: how often would a user select a macro (marker) and use the drop-down to change it to another macro implementation? I think that whenever a user needs another macro, they should just drag it out from the operator menu.

Ok those are all my high level comments regarding this PR. I am sure we will have much more detailed comments going deeper.

So let's talk about breaking this into smaller PRs. I still think this PR is too large: it bundles too many design choices, including creating a macro, reusing a previously created macro, the representation of a macro, editing a macro, and the runtime behavior of a macro. There are other parts of the life cycle that are unclear or not yet discussed: removing a macro, disabling a macro, backend compilation of a macro, persistence of a macro (how to represent it in the DB), search of macros (indexing), and so on. There are also other things we need to change, such as documentation, tests, and tutorials on how to use macros.

The right size for a PR is one where we can discuss and review one or two design choices. A smaller scope makes the discussion easier. For instance, I would recommend one PR about creating a macro (without saving or reusing it), so we can discuss the experience of creating a macro in that PR. Then you can have a PR to persist the macro into the DB, where we can discuss where to store macro information and how to make sure search can index it. Then a PR to introduce the experience of reusing a previously stored macro in a new workflow. Then another PR to edit a previously created macro. For each feature, you can split into frontend and backend to make the size even smaller. I think you can see my point now. Making PRs smaller is not the goal in itself: smaller PRs get proper reviews and discussions of your changes.

You may worry that these are partial features: can we merge them into main? Yes, we can. Just add a config that turns the macro feature off by default in main, so it does not affect any other features. After more PRs implement the whole life cycle, fix bugs, and add documentation, we can then turn the feature on by default so users can see and use it.
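
The off-by-default config suggested above could look roughly like the following sketch. The flag name `workflowMacroEnabled`, the `"WorkflowMacro"` operator type string, and both helper functions are assumptions for illustration; no such config exists in the PR as described.

```typescript
// Hypothetical GUI flags; only the macro flag is modeled here.
interface GuiFlagsSketch {
  workflowMacroEnabled?: boolean;
}

// The feature stays off unless a deployment explicitly turns it on.
function isMacroEnabled(flags: GuiFlagsSketch): boolean {
  return flags.workflowMacroEnabled ?? false;
}

// Hide the macro entry from the operator menu while the flag is off,
// so partially implemented macro features cannot affect other users.
function visibleOperatorTypes(allTypes: string[], flags: GuiFlagsSketch): string[] {
  return isMacroEnabled(flags) ? allTypes : allTypes.filter(t => t !== "WorkflowMacro");
}

const allOperatorTypes = ["Scan", "Filter", "WorkflowMacro"];
const defaultMenu = visibleOperatorTypes(allOperatorTypes, {});
const optInMenu = visibleOperatorTypes(allOperatorTypes, { workflowMacroEnabled: true });
```

Gating at the menu level is only one option; a real rollout would likely also need to guard loading of workflows that already contain macros.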

I hope this makes sense to you!

@Yicong-Huang
Contributor

BTW, @Xiao-zhen-Liu, do you want to merge your version #5115 into main? I do like some of its experience. It would be great if we could unify those two experiences...

@carloea2
Contributor Author

carloea2 commented May 24, 2026

@Yicong-Huang

The context menu to create a macro is just for convenience, I believe; we can add it here as well, maybe as a separate PR.

I do not think a macro should be a real operator; instead, it should just act as a special UI marker to collapse/expand the operators inside it. Hash Join serves a different purpose because it relies on compilation; a macro does not need special treatment in the compilation phase.

Also, please check the next picture. I believe that in this PR a collapsed macro looks like a normal operator to the user, including the virtual input/output ports, which are derived automatically.

image

Regarding the aggregated statistics, I am not sure which is better; maybe let the user choose which statistics to see? I believe users will be more interested in the statistics of the leaf operators inside the macro.

Also, in this implementation, the user selects a workflow they have access to and imports it as a snapshot, so it does not stay synced with the source workflow (we can decide whether we want this or not; either way can be implemented).

I do not like the idea of creating "My macros". I think a macro should just represent an internal workflow, and from my perspective it feels natural for a user to select it from the workflows they have access to, the same way a user selects a file from a dataset.

This PR does not modify any DB schema, so there is no need for indexes, etc. It is mainly frontend changes with one property added to operators in the backend.

Finally, #5115 has 5x more LOC than this PR. (For sure @Xiao-zhen-Liu has more features, and AI-augmented features?)

@Yicong-Huang
Contributor

Let's break them down to discuss one by one. It is not efficient to discuss all those design choices together.

I do hope to make Hash Join a macro operator in the long run. It should not require compilation to expand it; that was tech debt.

@aglinxinyuan
Contributor

@Yicong-Huang, @carloea2

There are many design decisions to make, but there is one particularly important choice in this PR that I strongly prefer over #5115: all macro-related features should be implemented purely on the frontend. The backend should not compile macros, nor should it even be aware of the existence of macros.

By keeping the implementation entirely on the frontend, we can significantly simplify the architecture. That is also why this PR has roughly 5× less code than #5115.

I don’t have strong opinions on the PR size or the number of PRs, but I strongly recommend keeping the implementation frontend-only.
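
A rough sketch of what frontend-only could mean for the execution request, assuming the payload is built from the persisted content by passing operators and links through and leaving out `macros[]`. Apart from `macroIdParent` and `macros`, every name here is hypothetical, and the PR's real payload builder is not shown in this thread.

```typescript
// Hypothetical shapes for illustration only.
interface ContentOp {
  operatorID: string;
  operatorType: string;
  macroIdParent?: string; // metadata only, per the PR description
}

interface ContentLink {
  source: string;
  target: string;
}

interface ContentMacro {
  macroId: string;
  name: string;
  collapsed: boolean;
}

interface ContentSketch {
  operators: ContentOp[];
  links: ContentLink[];
  macros?: ContentMacro[]; // optional, so older workflows still load
}

interface ExecutionPayloadSketch {
  operators: ContentOp[];
  links: ContentLink[];
}

// Build the execution request from the full operator graph. Macro frames and
// their collapsed/expanded state are UI-only and are not sent.
function toExecutionPayload(content: ContentSketch): ExecutionPayloadSketch {
  return { operators: content.operators, links: content.links };
}

const baseContent: ContentSketch = {
  operators: [
    { operatorID: "scan-1", operatorType: "Scan" },
    { operatorID: "filter-1", operatorType: "Filter", macroIdParent: "m1" },
  ],
  links: [{ source: "scan-1", target: "filter-1" }],
  macros: [{ macroId: "m1", name: "Cleaning", collapsed: false }],
};
const collapsedContent: ContentSketch = {
  ...baseContent,
  macros: [{ macroId: "m1", name: "Cleaning", collapsed: true }],
};
const payloadExpanded = toExecutionPayload(baseContent);
const payloadCollapsed = toExecutionPayload(collapsedContent);
```

Under this assumption, collapsing or expanding a macro cannot change what the backend receives, because the payload never reads the macro list.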

@carloea2
Contributor Author

Agree

@Yicong-Huang
Contributor

I also prefer the macro to be expanded in the frontend. Same logic: I hope the design can capture Hash Join as well.

I am not looking at the code yet, so the number of changed lines does not matter. I hope to discuss the design/user experience first.
Can we move the discussion to issues instead of a particular PR? You can always refer to this PR for the implementation. It might also be a good idea to break the discussion down into separate issues.

@chenlica
Contributor

This is a very important and powerful feature. I suggest we do a live discussion about the details and report the decision here.

Labels

common frontend Changes related to the frontend GUI

Development

Successfully merging this pull request may close these issues.

Add workflow macros for importing, grouping, and collapsing reusable sub-workflows

6 participants