- Added `Florence2 Segmentation` node for Florence-2 tasks: polygon masks, phrase grounding (boxes), and region proposals.
- Added `Florence2 To Coordinates` tool node to convert Florence-2 JSON into center coordinates, bounding boxes, and masks.
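The box-to-center conversion this tool performs can be sketched as follows. The JSON keys (`bboxes`, `labels`) are assumptions based on Florence-2's typical output shape; the node's actual schema may differ:

```python
# Hypothetical sketch: convert Florence-2 style bbox JSON into center points.
# The key names ("bboxes", "labels") are assumptions, not the node's real API.
import json

def boxes_to_centers(florence_json: str):
    data = json.loads(florence_json)
    results = []
    for (x1, y1, x2, y2), label in zip(data["bboxes"], data.get("labels", [])):
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2  # box center
        results.append({"label": label, "center": (cx, cy),
                        "bbox": (x1, y1, x2, y2)})
    return results

sample = '{"bboxes": [[10, 20, 50, 80]], "labels": ["cat"]}'
print(boxes_to_centers(sample))
```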
- Added `YoloV8`/`YoloV8Adv` nodes for YOLOv8 detection, producing annotated images, merged masks, and mask lists.
- Added `ColorToMask` node to generate masks from a target color with threshold and invert options.
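The basic idea behind color-to-mask selection can be sketched like this; the per-channel distance metric here is an assumption, and the actual node's thresholding may differ:

```python
# Minimal sketch of color-to-mask logic, assuming a per-channel max-distance
# test against a target RGB color. Not the node's actual implementation.
import numpy as np

def color_to_mask(image: np.ndarray, target_rgb, threshold=30, invert=False):
    """image: HxWx3 uint8 array -> HxW float32 mask in [0, 1]."""
    # largest per-channel deviation from the target color
    dist = np.abs(image.astype(np.int16) - np.array(target_rgb)).max(axis=-1)
    mask = (dist <= threshold).astype(np.float32)
    return 1.0 - mask if invert else mask

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (255, 0, 0)                 # one red pixel
mask = color_to_mask(img, (255, 0, 0))  # selects only the red pixel
```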
- Added `ImageToList` node: combines up to 6 images into a batch with optional resize modes: off, fit, crop.
- Added `MaskToList` node: converts a batch of masks into a mask list.
- Added `ImageMaskToList` node: converts a batch of images and masks into an image and mask list.
- Added `ImageResize` node: a comprehensive all-in-one image resizing tool with robust handling for most scenarios. Supports custom width and height, megapixel constraints, longest/shortest side resizing, padding, cropping, and additional flexible options.
- Enhanced the `Compare` node by adding support for `bg_color` and `text_color` properties. These improvements apply to both side-by-side image comparison and video comparison.
- Updated `SAM3 Segmentation` node: added output mode (merged/separate), max segment, segment_pick, and device controls.
- Removed the global `torch.load` override; TorchScript is now handled locally in SAM2.
- TorchScript is handled via a local fallback to avoid interfering with other nodes.
- Improves overall compatibility and stability in mixed ComfyUI environments.
- `triton-windows`: required for proper SAM2/SAM3 model execution on Windows.
- `ultralytics`: required for the YOLO nodes.
Note: YOLO nodes require the optional `ultralytics` package. Install it only if you need YOLO, to avoid dependency conflicts:
`./ComfyUI/python_embeded/python -m pip install ultralytics --no-deps`
- Rebuilt ImageCompare node with enhanced features
- Added support for 3 images (previously 2)
- Added size_base parameter: choose largest, smallest, or specific image as reference
- Added customizable text_color and bg_color parameters
- Updated our nodes to match the latest ComfyUI V3 schema changes.
- Fixed compatibility issues affecting multiple nodes, including `ImageCompositeMasked`. Thanks to reports in #132 and #146.
- Added model unload for SAM3 segmentation.
- Helps free memory after each run and improves long-session stability.
- Thanks to contribution and feedback from #147
- Bug fix: SAM3 Segmentation CPU mode no longer crashes from mixed cuda/cpu tensors when a GPU is present. (#135)
- Added missing dependency `decord` to requirements.txt. (#136)
- Added `SAM3 Segmentation` node with Meta's latest SAM3 segmentation model

- `SAM3Segment`: RMBG-focused text segmentation using the official SAM3 model checkpoint
- Sharper edges and faster inference versus SAM2 in our tests; supports FP32/FP16 autocast on CUDA
- Alpha/Color background output, mask blur/offset/invert, plus RGB mask image for quick compositing
- Bug fix: the latest ComfyUI update caused an issue with the `color` widget. We have addressed the problem and updated all related nodes; the widget now functions correctly. (User reported, #118)
- Added `BiRefNet_toonOut` general purpose model (balanced performance) (User request #110)
- `ImageStitch` node updates: migrated to the latest architecture. Now supports 4-image input with a new 2x2 stitching mode. Automatically applies smart kontext_mode when 4 images are provided. Output layout is configured as 3 images on the left and 1 on the right. Added support for megapixel constraints and new upscaling methods.
- Refactored `LoadImage` & `LoadImageAdvanced` nodes
  - Reworked resizing logic for more powerful and intuitive control.
  - New execution priority: `megapixels` has the highest priority; otherwise `size` and `scale_by` work together in a pipeline.
  - Improved image quality by calculating the final target size first and performing only a single resize operation to prevent quality loss.
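The priority order described above can be sketched roughly like this. The parameter names mirror the node's widgets, but the details (e.g., whether `size` targets the longest side) are assumptions, not the node's actual code:

```python
# Hedged sketch of the resize priority: megapixels wins outright;
# otherwise size and scale_by combine into one final target, so that
# only a single resize operation is ever performed.
def final_size(w, h, megapixels=0.0, size=0, scale_by=1.0):
    if megapixels > 0:                      # highest priority
        scale = (megapixels * 1_000_000 / (w * h)) ** 0.5
        return round(w * scale), round(h * scale)
    if size > 0:                            # assume size fits the longest side
        scale = size / max(w, h)
    else:
        scale = 1.0
    scale *= scale_by                       # scale_by applies on top of size
    return round(w * scale), round(h * scale)

print(final_size(1024, 768, size=512, scale_by=2.0))  # one combined resize
```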
- Added `METADATA_TEXT` output to `LoadImageAdvanced`
  - The `LoadImageAdvanced` node now outputs the embedded generation parameters from AI-generated PNG files (e.g., prompts, model, seed).
  - This allows for easy workflow replication by connecting the metadata directly to text inputs.
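Reading such embedded parameters is typically a matter of inspecting the PNG's text chunks. Most generation tools store them under a `parameters` key; that key is an assumption about the files, not a statement about this node's internals:

```python
# Sketch of reading generation parameters from an AI-generated PNG.
# The "parameters" tEXt-chunk key is a common convention, assumed here.
from PIL import Image

def read_png_metadata(path: str) -> str:
    with Image.open(path) as im:
        # Pillow exposes PNG text chunks via the info dict
        return im.info.get("parameters", "")
```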
- Enhanced the RMBG node to optimize batch processing of images and videos (User request #100)
- Reconstructed the `ColorWidget` to improve stability and prevent potential freezes in certain ComfyUI configurations.
- Added `SDMatte Matting` node (User request #99)
  - Optional `mask` input; if omitted and the input image has an alpha channel, the alpha is used as the mask
  - Unified explicit bilinear resizing for inputs/outputs; improved consistency with other nodes
  - Inference optimizations: `torch.inference_mode`, CUDA FP16 autocast, memory cleanup, and explicit GPU fallback messaging
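The inference optimizations listed above usually follow a pattern like the one below. The model and input names are placeholders, not the node's actual API; this is only an illustration of the technique:

```python
# Illustrative pattern: inference_mode + CUDA FP16 autocast + cleanup,
# with an explicit message when falling back to CPU.
import torch

def run_inference(model, batch):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    with torch.inference_mode():            # no autograd bookkeeping
        if device == "cuda":
            with torch.autocast("cuda", dtype=torch.float16):
                out = model(batch.to(device))
        else:
            print("CUDA not available, falling back to CPU")
            out = model(batch.to(device))
    if device == "cuda":
        torch.cuda.empty_cache()            # free cached GPU memory
    return out
```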
- Added SAM2 segmentation nodes with the latest Facebook Research SAM2 technology
  - `SAM2Segment`: text-prompted segmentation with 4 model variants (Tiny/Small/Base Plus/Large)
  - Improved accuracy and faster processing compared to SAM v1
  - FP16/FP32 precision support and better edge detection
- Enhanced color widget support across all nodes
- Fixed color picker functionality and improved color handling consistency
- Updated SAM2 model integration with optimized memory usage and batch processing
- Bug fixes and improved code compatibility
- Enhanced LoadImage node with direct URL and path support

- Added image_path_or_URL parameter for loading images from local paths or URLs
- Improved URL handling with User-Agent support for better compatibility
- Maintained compatibility with traditional file selection
- Simplified workflow for external image sources
- Three different LoadImage nodes for different purposes and needs:
  - `LoadImage`: standard image loader with commonly used options, suitable for most workflows
  - `LoadImageSimple`: minimalist image loader for quick and basic image loading
  - `LoadImageAdvanced`: advanced image loader with extended configuration for power users
- Completely redesigned `ImageStitch` node with advanced features
- Compatible with ComfyUI's native image stitch functionality
- Added support for 3-image stitching with kontext_mode
- Improved spacing and background color options
- Added maximum size constraints for output images
- Enhanced image matching and padding options
- Better handling of different image sizes and aspect ratios
- Included commonly requested user settings for more flexibility
- Fixed background color handling across all nodes
- Resolved errors reported by users when using color picker
- Fixed color application in segmentation and background removal nodes
- Improved color consistency across different operations
- Added the first RMBG inpainting tool for the Flux Kontext model: the `ReferenceLatentMask` node, which leverages a reference latent and mask for precise region conditioning. (Stay tuned, more tools will be released in future updates.)
- Updated the RMBG `LoadImage` node: added an upscaling method for improved output quality, refined image output to RGB format, and optimized the alpha channel in the mask output.
- Fixed the missing BiRefNet Models
- Introduced the `MaskOverlay` node, enabling mask overlays directly on images.
- Added `ImageMaskResize` node for resizing images and masks with various options.
- Implemented the LamaRemover node for object removal using the LaMa model. For a more advanced object removal solution, see our companion project: ComfyUI-MiniMax-Remover

- Added 2 BiRefNet models: `BiRefNet_lite-matting` and `BiRefNet_dynamic`
- Added batch image support for `Segment_v1` and `Segment_V2` nodes
- Added `CropObject` node for cropping to an object based on mask or alpha channel (User request #61)
- Added `ImageCompare` node for side-by-side image comparison with annotations
- Added `ColorInput` node to pick a preset color or input an RGB color code in #000000 or #000 format (User request #62)
- Updated `MaskExtractor` node: added a color picker and support for RGBA images by extracting and using the alpha channel as the mask
- Updated `ImageCombiner` node: added WIDTH and HEIGHT outputs
- Uses Hugging Face transformers library
- Better compatibility with newer PyTorch (2.x) and CUDA versions
- Recommended for users with modern GPU setups
- No groundingdino-py dependency required
(User request #66 )
- Uses original groundingdino-py implementation
- May have compatibility issues with newer PyTorch/CUDA versions
- Consider using V2 if you encounter installation issues
Choose the appropriate version based on your setup:
- For modern systems (PyTorch 2.x, CUDA 12.x+), use Segment V2
- For legacy systems or if you specifically need groundingdino-py, use Segment V1
- Added support for more segmentation models in the Segment node:
  - SAM HQ models (vit_h, vit_l, vit_b)
- Changed background color input to color picker for better color selection
- Updated and standardized the `i18n` format for all nodes, improving multilingual compatibility and fixing some translation display issues
- Added node appearance style options, allowing customization of node appearance in the ComfyUI graph for better visual distinction and user experience
- Enhanced ICLoRA Concat node to fully support the native ComfyUI Load Image node, addressing previous limitations with mask scaling. ICLoRA Concat is now compatible with both the RMBG and native image loaders.
- Added `Image Crop` node: flexible cropping tool for images, supporting multiple anchor positions, offsets, and split output for precise region extraction.
- Added `ICLoRA Concat` node: enables mask-based image concatenation with customizable direction (left-right or top-bottom), size, and region, suitable for advanced image composition and layout.
- Added resizing options for Load Image: Longest Side, Shortest Side, Width, and Height, enhancing flexibility.
- Fixed an issue where the preview node did not display images on Ubuntu.
- Bug fixes
- Added the following nodes:
  - `Image Combiner`: merges two images into one with various blending modes and positioning options.
  - `Image Stitch`: stitches multiple images together in different directions (top, bottom, left, right).
  - `Image/Mask Converter`: converts between images and masks.
  - `Mask Enhancer`: an independent node for enhancing mask output.
  - `Mask Combiner`: combines multiple masks into one.
  - `Mask Extractor`: extracts masks from images.
- Fixed compatibility issues with transformers version 4.49+ dependencies.
- Fixed i18n translation errors in multiple languages.
- Added mask image output to each segment node, making mask output as images more convenient.
Enhanced compatibility with Transformers
- Added support for higher versions of the transformers library (≥ 4.49.0)
- Resolved conflicts with other models requiring higher version transformers
- Improved error handling and more user-friendly error messages
- If you encounter issues, you can still revert to the recommended version: `pip install transformers==4.48.3`
The integration of internationalization (i18n) support significantly enhances ComfyUI-RMBG, enabling users worldwide to utilize background removal features in their preferred languages. This update fosters a more tailored and efficient workflow within ComfyUI-RMBG. The user interface has been improved to facilitate dynamic language switching according to user preferences. All newly introduced features are designed to be fully translatable, thereby improving accessibility for users who do not speak English.
| Custom Nodes i18n UI |
|---|
| English, 中文, 日本語, Русский, 한국어, Français |
- Added Load Image, Preview Image, Preview Mask, and a node that previews both the image and the mask simultaneously. This is the first phase of our toolset, with more useful tools coming in future updates.
- Reorganized the code structure for better maintainability, making it easier to navigate and update.
- Renamed certain node classes to prevent conflicts with other repositories.
- Improved category organization with a new structure: 🧪AILab/🛠️UTIL/🖼️IMAGE, making tools easier to find and use.
- Integrated predefined workflows into the ComfyUI Browse Template section, allowing users to quickly load and understand each custom node’s functionality.
- Optimized utility functions for image and mask conversion
- Improved error handling and code robustness
- Updated and changed some variable names for consistency
- Enhanced compatibility with the latest ComfyUI versions
- Cleaned up the code and fixed the transformers version issue: `transformers>=4.35.0,<=4.48.3`
- Added Fast Foreground Color Estimation feature
  - New `refine_foreground` option for optimizing transparent backgrounds
  - Improved edge quality and detail preservation
  - Better handling of semi-transparent regions
- Added OpenCV dependency for advanced image processing
- Enhanced foreground refinement algorithm
- Optimized memory usage for large images
- Improved edge detection accuracy
- Changed repository for model management to the new repository
- Reorganized models files structure for better maintainability
Added and grouped all BiRefNet model collections into the BiRefNet node.
- Added `BiRefNet` general purpose model (balanced performance)
- Added `BiRefNet_512x512` model (optimized for 512x512 resolution)
- Added `BiRefNet-portrait` model (optimized for portrait/human matting)
- Added `BiRefNet-matting` model (general purpose matting)
- Added `BiRefNet-HR` model (high resolution up to 2560x2560)
- Added `BiRefNet-HR-matting` model (high resolution matting)
- Added `BiRefNet_lite` model (lightweight version for faster processing)
- Added `BiRefNet_lite-2K` model (lightweight version for 2K resolution)
- Added FP16 (half-precision) support for better performance
- Optimized for high-resolution image processing
- Enhanced memory efficiency
- Maintained compatibility with existing workflows
- Simplified model loading through Transformers pipeline
*(To ensure compatibility with the old V1.8.0 workflow, we have replaced this image with the new BiRefNet node.)* (2025/03/01)
- Added support for BiRefNet High Resolution model
- Trained with 2048x2048 resolution images
- Superior performance metrics (maxFm: 0.925, MAE: 0.026)
- Better edge detection and detail preservation
- FP16 optimization for faster processing
- MIT License for commercial use
- BiRefNet-HR vs other models:
- Higher resolution support (up to 2048x2048)
- Better edge detection accuracy
- Improved detail preservation
- Optimized for high-resolution images
- More efficient memory usage with FP16 support
- Added support for BEN2 (Background Elimination Network 2)
- Improved performance over original BEN model
- Better edge detection and detail preservation
- Enhanced batch processing capabilities (up to 3 images per batch)
- Optimized memory usage and processing speed
- Updated model repository paths for BEN and BEN2
- Switched to 1038lab repositories for better maintenance and updates
- Maintained full compatibility with existing workflows
- Implemented efficient batch processing for BEN2
- Optimized memory management for large batches
- Enhanced error handling and model loading
- Improved model switching and resource cleanup
- BEN2 vs BEN:
- Better edge detection
- Improved handling of complex backgrounds
- More efficient batch processing
- Enhanced detail preservation
- Faster processing speed
- Added a new custom node for face parsing and segmentation
- Support for 19 facial feature categories (Skin, Nose, Eyes, Eyebrows, etc.)
- Precise facial feature extraction and segmentation
- Multiple feature selection for combined segmentation
- Same parameter controls as other RMBG nodes
- Automatic model downloading and resource management
- Perfect for portrait editing and facial feature manipulation
- Added a new custom node for fashion and accessories segmentation.
- Capable of identifying and segmenting various fashion items such as dresses, shoes, and accessories.
- Utilizes advanced machine learning techniques for accurate segmentation.
- Supports real-time processing for enhanced user experience.
- Ideal for fashion-related applications, including virtual try-ons and outfit recommendations.
- Support for gray background color.
- Added intelligent clothes segmentation functionality
- Support for 18 different clothing categories (Hat, Hair, Face, Sunglasses, Upper-clothes, etc.)
- Multiple item selection for combined segmentation
- Same parameter controls as other RMBG nodes (process_res, mask_blur, mask_offset, background options)
- Automatic model downloading and resource management
- Enhanced background handling to support RGBA output when "Alpha" is selected.
- Ensured RGB output for all other background color selections.
- Fixed an issue with mask processing when the model returns a list of masks.
- Improved handling of image formats to prevent processing errors.
- Text-Prompted Intelligent Object Segmentation
- Use natural language prompts (e.g., "a cat", "red car") to identify and segment target objects
- Support for multiple object detection and segmentation
- Perfect for precise object extraction and recognition tasks
- SAM (Segment Anything Model)
- sam_vit_h: 2.56GB - Highest accuracy
- sam_vit_l: 1.25GB - Balanced performance
- sam_vit_b: 375MB - Lightweight option
- GroundingDINO
- SwinT: 694MB - Fast and efficient
- SwinB: 938MB - Higher precision
- Intuitive Parameter Controls
- Threshold: Adjust detection precision
- Mask Blur: Smooth edges
- Mask Offset: Expand or shrink selection
- Background Options: Alpha/Black/White/Green/Blue/Red
- Automatic Model Management
- Auto-download models on first use
- Smart GPU memory handling
- Tag-Style Prompts
- Single object: "cat"
- Multiple objects: "cat, dog, person"
- With attributes: "red car, blue shirt"
- Format: Use commas to separate multiple objects (e.g., "a, b, c")
- Natural Language Prompts
- Simple sentence: "a person wearing a red jacket"
- Complex scene: "a woman in a blue dress standing next to a car"
- With location: "a cat sitting on the sofa"
- Format: Write a natural descriptive sentence
- Tips for Better Results
- For Tag Style:
- Separate objects with commas: "chair, table, lamp"
- Add attributes before objects: "wooden chair, glass table"
- Keep it simple and clear
- For Natural Language:
- Use complete sentences
- Include details like color, position, action
- Be as descriptive as needed
- Parameter Adjustments:
- Threshold: 0.25-0.35 for broad detection, 0.45-0.55 for precision
- Use mask blur for smoother edges
- Adjust mask offset to fine-tune selection
- Changed INSPYRENET model format from .pth to .safetensors for:
- Better security
- Faster loading speed (2-3x faster)
- Improved memory efficiency
- Better cross-platform compatibility
- Simplified node display name for better UI integration
- ANPG (animated PNG), AWEBP (animated WebP) and GIF supported.
- Fixed video processing issue
- Enhanced batch processing in RMBG-2.0 model
- Added support for proper batch image handling
- Improved memory efficiency by optimizing image size handling
- Added original size preservation for maintaining aspect ratios
- Implemented proper batch tensor processing
- Improved error handling and code robustness
- Performance gains:
- Single image processing: ~5-10% improvement
- Batch processing: up to 30-50% improvement (depending on batch size and GPU)
- Combined three background removal models into one unified node
- Added support for RMBG-2.0, INSPYRENET, and BEN models
- Implemented lazy loading for models (only downloads when first used)
- RMBG-2.0 (Homepage)
- Latest version of RMBG model
- Excellent performance on complex backgrounds
- High accuracy in preserving fine details
- Best for general purpose background removal
- INSPYRENET (Homepage)
- Specialized in human portrait segmentation
- Fast processing speed
- Good edge detection capability
- Ideal for portrait photos and human subjects
- BEN (Background Elimination Network) (Homepage)
- Robust performance on various image types
- Good balance between speed and accuracy
- Effective on both simple and complex scenes
- Suitable for batch processing
- Unified interface for all three models
- Common parameters for all models:
- Sensitivity adjustment
- Processing resolution control
- Mask blur and offset options
- Multiple background color options
- Invert output option
- Model optimization toggle
- Optimized memory usage with model clearing
- Enhanced error handling and user feedback
- Added detailed tooltips for all parameters
- Improved mask post-processing
- Updated all package dependencies to latest stable versions
- Added support for transparent-background package
- Optimized dependency management
- Added background color options
- Alpha (transparent background)
- Black, White, Green, Blue, Red
- Improved mask processing
- Better detail preservation
- Enhanced edge quality
- More accurate segmentation
- Added video batch processing
- Support for video file background removal
- Maintains original video framerate and resolution
- Multiple output format support (with Alpha channel)
- Efficient batch processing for video frames
- Added model cache management
- Cache status checking
- Model memory cleanup
- Better error handling
- Renamed 'invert_mask' to 'invert_output' for clarity
- Added sensitivity adjustment for mask strength
- Updated tooltips for better clarity
- Optimized image processing pipeline
- Added proper model cache verification
- Improved memory management
- Better error handling and recovery
- Enhanced batch processing performance for videos
- Added timm>=0.6.12,<1.0.0 for model support
- Updated requirements.txt with version constraints
- Fixed mask detail preservation issues
- Improved mask edge quality
- Fixed memory leaks in model handling
- The 'Alpha' background option provides transparent background
- Sensitivity parameter now controls mask strength
- Model cache is checked before each operation
- Memory is automatically cleaned when switching models
- Video processing supports various formats and maintains quality