This paper proposes Caption Booru, an open, privacy-aware platform for collecting, curating, and evaluating image captions at scale. Caption Booru combines moderated community contribution, automated captioning models, and structured metadata to create a searchable dataset for research and applications in multimodal AI. We present the system design, dataset schema, moderation policy, model-in-the-loop curation process, evaluation methodology, and initial experimental results.
Flux LoRAs: Are natural-language captions much better than booru tags?
These sites use a collaborative "folksonomy" tagging system. Instead of folders, you search for images using specific combinations of tags, such as characters, artists, or specific themes. Safety and Filtering Tagging permanence: No more losing a story because
Granular Control: Tags allow you to specify exact details, such as camera angles, lighting, and specific character traits, without the "noise" of complex grammar.
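
To make the tag-combination idea concrete, here is a minimal Python sketch of how a tag-based lookup and a tag-style caption might work. Everything in it is an assumption for illustration: the image IDs, the tags, and the function names `search` and `to_caption` are hypothetical and are not taken from any particular booru site or from Caption Booru itself.

```python
from typing import Dict, List, Set

# Hypothetical in-memory index: image ID -> set of tags (illustrative data only).
INDEX: Dict[str, Set[str]] = {
    "img_001": {"1girl", "low_angle", "backlighting", "red_hair"},
    "img_002": {"1girl", "from_above", "soft_lighting", "red_hair"},
    "img_003": {"landscape", "low_angle", "backlighting"},
}


def search(index: Dict[str, Set[str]], required: Set[str]) -> List[str]:
    """Return the sorted IDs of images that carry every tag in `required`."""
    # `required <= tags` is a subset test: all requested tags must be present.
    return sorted(img for img, tags in index.items() if required <= tags)


def to_caption(tags: Set[str]) -> str:
    """Join tags into a comma-separated, tag-style caption (sorted for determinism)."""
    return ", ".join(sorted(tags))


if __name__ == "__main__":
    # Images shot from a low angle with backlighting, regardless of subject.
    print(search(INDEX, {"low_angle", "backlighting"}))
    # Tag-style caption for one image, with no sentence grammar involved.
    print(to_caption(INDEX["img_003"]))
```

Because each tag is an independent token, narrowing a search or a caption is just a matter of adding one more tag to the set. A real site would likely use a database index rather than a linear scan, so treat this only as a model of the behavior.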