Examples
How to run
To run the examples, save them as Python files (e.g. workflow.py) and run them using:
beprepared run workflow.py
You will almost always use a Load node to import images and a Save node to write out the cleaned data set. What comes in between can be simple or complex.
When you invoke a Human* node like HumanFilter or HumanTag, beprepared will launch a web-based interface on port 8989.
If there are unfiltered or untagged images, you will be prompted to go to the web interface and perform the filtering or tagging.
By convention, Save will place output images in the output/ directory. For each image, there will be a companion .txt file containing that image's caption. There will also be a .json file which contains all of the image's metadata. Finally, there is an index.html which allows you to view the images and their metadata in a web browser.
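Because each output image sits next to its caption and metadata files, the results are easy to consume from a script. The sketch below assumes that the companion files share the image's file stem (for example 0001.jpg, 0001.txt and 0001.json); that naming is an assumption, not something this page confirms. read_output is a hypothetical helper, not part of beprepared.

```python
import json
from pathlib import Path

# Image extensions to look for; adjust to match what your workflow writes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def read_output(directory):
    """Return a list of (image_path, caption, metadata) tuples.

    Assumption: companion files share the image's stem, e.g. 0001.jpg,
    0001.txt and 0001.json. A missing companion file yields None.
    """
    records = []
    for image_path in sorted(Path(directory).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skips index.html, .txt and .json files
        caption_path = image_path.with_suffix(".txt")
        metadata_path = image_path.with_suffix(".json")
        caption = None
        if caption_path.exists():
            caption = caption_path.read_text(encoding="utf-8").strip()
        metadata = None
        if metadata_path.exists():
            metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
        records.append((image_path, caption, metadata))
    return records
```

Calling read_output("output") after a workflow finishes would then give one tuple per saved image, which can be handed to whatever training or inspection code comes next.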
Captioning with a trigger word
This is a simple example of how to use beprepared to caption images based on a trigger word.
(
Load("/path/to/photos_of_me")
>> FilterBySize(min_edge=512)
>> ConvertFormat("JPEG")
>> Dedupe
>> SetCaption("ohwx person")
>> Save
)
Auto-captioning using JoyCaption
(
Load("/path/to/photos_of_me")
>> FilterBySize(min_edge=512)
>> ConvertFormat("JPEG")
>> JoyCaptionAlphaOne
>> Save
)
Fuzzy deduplication
This example shows how to use FuzzyDedupe to remove duplicate images based on CLIP embeddings.
(
Load("/path/to/photos_of_me")
>> ClipEmbed
>> FuzzyDedupe
>> Save
)
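To give a feel for what embedding-based deduplication means, here is a minimal sketch in plain Python. It is not FuzzyDedupe's implementation: the greedy strategy, the cosine-similarity measure, and the 0.95 threshold are all assumptions chosen for illustration, and the three-dimensional vectors stand in for real CLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def fuzzy_dedupe(embeddings, threshold=0.95):
    """Return the indices of the items to keep.

    Greedy pass: an item is kept only if its similarity to every
    previously kept item is below the (illustrative) threshold.
    """
    kept = []
    for i, embedding in enumerate(embeddings):
        if all(cosine_similarity(embedding, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy vectors standing in for CLIP embeddings.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],  # nearly identical to the first vector
    [0.0, 1.0, 0.0],
]
print(fuzzy_dedupe(embeddings))  # [0, 2]
```

Unlike exact deduplication, this kind of comparison can catch resized or recompressed copies, at the cost of having to pick a similarity threshold.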
Filtering based on Aesthetic Score
This example shows how to select the best 100 images based on their aesthetic score.
(
Load("/path/to/photos_of_me")
>> AestheticScore
>> Sorted(lambda image: image.aesthetic_score.value, reverse=True)
>> Take(100)
>> Save
)
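The sort-then-take step in this workflow amounts to a top-N selection. The following plain-Python sketch shows the same idea on dictionaries; the field name aesthetic_score and the helper top_n are illustrative stand-ins, not beprepared objects.

```python
def top_n(images, n):
    """Return the n highest-scoring images, best first."""
    ranked = sorted(images, key=lambda image: image["aesthetic_score"], reverse=True)
    return ranked[:n]


# Made-up scores for illustration only.
images = [
    {"name": "a.jpg", "aesthetic_score": 5.1},
    {"name": "b.jpg", "aesthetic_score": 3.2},
    {"name": "c.jpg", "aesthetic_score": 7.8},
    {"name": "d.jpg", "aesthetic_score": 4.4},
]
print([image["name"] for image in top_n(images, 2)])  # ['c.jpg', 'a.jpg']
```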
Filtering out NSFW content
This example shows how to filter out NSFW content using NudeNet.
(
Load("/path/to/images")
>> NudeNet
>> Filter(lambda image: not image.has_nudity)
>> Save
)
Filter images manually then caption with GPT4o
To run this example, you will need to set OPENAI_API_KEY
in your environment.
(
Load("/path/to/photos_of_me")
>> FilterBySize(min_edge=512)
>> ConvertFormat("JPEG")
>> HumanFilter
>> GPT4oCaption
>> Save
)
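Since this workflow depends on OPENAI_API_KEY, it can be worth checking the environment before starting a long run. This is a small standard-library sketch; missing_env_vars is a hypothetical helper and is not provided by beprepared.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Set these before running the workflow:", ", ".join(missing))
```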
Manually tag images
(
Load("/path/to/photos_of_dogs")
>> FilterBySize(min_edge=512)
>> ConvertFormat("JPEG")
>> HumanFilter
>> HumanTag(tags=["labrador", "golden retriever", "poodle"])
# A lambda cannot contain an assignment statement, so the caption is set with setattr.
>> Apply(lambda image: setattr(image.caption, 'value', ', '.join(['dog'] + image.tags.value)))
>> Save
)
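The caption produced by the last step is just the word "dog" followed by whatever tags were chosen in the web interface. As a standalone illustration, assuming the tags arrive as a list of strings (which the list concatenation in the workflow implies), the string is built like this; build_caption is a hypothetical name used only here.

```python
def build_caption(tags, prefix="dog"):
    """Join a fixed prefix and the selected tags into one caption string."""
    return ", ".join([prefix] + list(tags))


print(build_caption(["labrador", "golden retriever"]))  # dog, labrador, golden retriever
print(build_caption([]))  # dog
```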
Captioning a mix of SFW and NSFW content using various VLMs
Some VLMs are more NSFW-friendly than others. This workflow shows how to split the workflow and use different caption strategies for NSFW content.
# Named all_images rather than all to avoid shadowing Python's built-in all().
all_images = (
Load("/path/to/images")
>> FilterBySize(min_edge=512)
>> ConvertFormat("JPEG")
>> NudeNet
)
with_nudity = all_images >> Filter(lambda image: image.has_nudity) >> JoyCaptionAlphaOne
without_nudity = all_images >> Filter(lambda image: not image.has_nudity) >> GPT4oCaption
Concat(with_nudity, without_nudity) >> Save
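The split works because the two filters use complementary predicates, so every image goes down exactly one branch before the branches are joined again. The plain-Python sketch below illustrates that property with dictionaries; the field names and caption strings are placeholders, and the ordering of the combined list is an assumption about how a concatenation would typically behave.

```python
# Placeholder records standing in for images that have already been classified.
images = [
    {"name": "a.jpg", "has_nudity": True},
    {"name": "b.jpg", "has_nudity": False},
    {"name": "c.jpg", "has_nudity": False},
]

# Each branch applies its own captioning strategy (placeholder strings here).
with_nudity = [
    dict(image, caption="caption from model A")
    for image in images if image["has_nudity"]
]
without_nudity = [
    dict(image, caption="caption from model B")
    for image in images if not image["has_nudity"]
]

# Joining the branches yields every image exactly once.
combined = with_nudity + without_nudity
print([image["name"] for image in combined])  # ['a.jpg', 'b.jpg', 'c.jpg']
```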
Captioning using multiple VLMs
This workflow shows how to use multiple VLMs to caption an image, and then combine the results into a single caption using LLMCaptionTransform.
To run this example, you will need to set OPENAI_API_KEY and TOGETHER_AI_API_KEY in your environment.
LLM APIs are accessed using litellm, and any model string supported by litellm should work here.
(
Load("/path/to/images")
>> FilterBySize(min_edge=512)
>> ConvertFormat("JPEG")
>> JoyCaptionAlphaOne(target_prop='joycaption')
>> GPT4oCaption(target_prop='gpt4ocaption')
>> XGenMMCaption(target_prop='xgenmmcaption')
>> QwenVLCaption(target_prop='qwenvlcaption')
>> LlamaCaption(target_prop='llamacaption')
>> LLMCaptionTransform('together_ai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo',
lambda image: f"""
Multiple VLMs have captioned this image. These are their results:
- JoyCaption: {image.joycaption.value}
- GPT4oCaption: {image.gpt4ocaption.value}
- XGenMMCaption: {image.xgenmmcaption.value}
- QwenVLCaption: {image.qwenvlcaption.value}
- LlamaCaption: {image.llamacaption.value}
Please generate a final caption for this image based on the above information. Your response should be the caption, with no extra text or boilerplate.
""".strip(),
target_prop='caption')
>> Save
)
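To preview the prompt that the lambda above assembles, the same text can be built outside the workflow. This sketch uses a plain dictionary in place of the image's caption properties; build_prompt is a hypothetical helper and the sample captions are made up.

```python
def build_prompt(captions):
    """Assemble the merge prompt from a mapping of captioner name to caption text."""
    lines = ["Multiple VLMs have captioned this image. These are their results:"]
    lines += [f"- {name}: {text}" for name, text in captions.items()]
    lines.append(
        "Please generate a final caption for this image based on the above "
        "information. Your response should be the caption, with no extra "
        "text or boilerplate."
    )
    return "\n".join(lines)


# Made-up captions for illustration only.
prompt = build_prompt({
    "JoyCaption": "a dog on a beach",
    "GPT4oCaption": "a golden retriever running along the shoreline",
})
print(prompt)
```

Printing the prompt for a few sample captions is a cheap way to check its wording before spending API calls on a full data set.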