Create 4chan posts with images.
- Clone the repository:

```bash
git clone https://github.com/RobertoNeglia/PePeAI.git
cd PePeAI
```

- Create a virtual environment:

```bash
conda env create -f environment.yml
conda activate genai
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install .
```
If you are running into environment issues, try running these terminal commands instead:

```bash
conda create -n genai python=3.10 -y
conda activate genai
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install transformers peft accelerate safetensors
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install .
```
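To confirm that PyTorch was installed with CUDA support, a quick optional check:

```python
# Optional sanity check: PyTorch version and GPU visibility.
import torch
print(torch.__version__, torch.cuda.is_available())
```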
- Run GUI.py:

```bash
cd scripts
python GUI.py
```

- Write a subject in the textbox.
- Click on the generate button.
- Download the dataset from Kaggle (not really needed: the dataset is already on Hugging Face).
- Export environment variables:

```bash
export MODEL_NAME="stabilityai/stable-diffusion-2-base" # or "runwayml/stable-diffusion-v1-5" for SD1.5
export DATASET_NAME="RobertoNeglia/pepe_dataset_sentiment"
export OUTPUT_DIR="/path/to/output/dir" # e.g. "/home/user/pepe_lora"
export HUB_MODEL_ID="yourusername/model_name" # e.g. "RobertoNeglia/pepe_generator"
```

Replace `/path/to/output/dir` with the path where you want to save the model and `yourusername/model_name` with your Hugging Face username and desired model name.
- Login to the Hugging Face Hub:

```bash
huggingface-cli login
```

Follow the instructions to log in to your Hugging Face account. If you want to push the model to the Hub, make sure you have the right permissions to do so, or generate a new token.
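If you prefer to authenticate from Python (e.g. in a notebook), the `huggingface_hub` library offers an equivalent helper; a minimal sketch:

```python
# Equivalent to `huggingface-cli login`, but from Python.
from huggingface_hub import login

login()  # prompts for a token; a token string can also be passed directly
```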
- Login to WandB (for logging):

```bash
wandb login
```

Follow the instructions to log in to your WandB account.
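The same can be done from Python if that is more convenient; a minimal sketch:

```python
# Equivalent to `wandb login`, but from Python.
import wandb

wandb.login()  # prompts for an API key on first use
```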
- Run the training script from the diffusers library:

```bash
accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME \
  --dataloader_num_workers=8 \
  --resolution=512 \
  --center_crop \
  --random_flip \
  --train_batch_size=24 \
  --gradient_accumulation_steps=4 \
  --max_train_steps=4300 \
  --learning_rate=1e-04 \
  --max_grad_norm=1 \
  --lr_scheduler="cosine" \
  --lr_warmup_steps=0 \
  --output_dir=${OUTPUT_DIR} \
  --push_to_hub \
  --hub_model_id=${HUB_MODEL_ID} \
  --report_to=wandb \
  --checkpointing_steps=500 \
  --validation_prompt="pepe the frog, happiness" \
  --seed=42 \
  --caption_column="features" \
  --validation_epochs=50 \
  --rank=16
```

Adjust `--train_batch_size` to fit your GPU memory. `--caption_column="features"` is the dataset column containing the captions and `--rank=16` is the LoRA rank; checkpoints are saved every 500 steps, validation (with the prompt above) runs every 50 epochs, training logs go to WandB, and the final model is pushed to the Hugging Face Hub.
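Before launching a long run, it can be worth sanity-checking that the caption column referenced above actually exists in the dataset; a minimal sketch using the `datasets` library (assuming the default `train` split):

```python
from datasets import load_dataset

# Load the training data and confirm the caption column used above exists.
ds = load_dataset("RobertoNeglia/pepe_dataset_sentiment", split="train")
print(ds.column_names)    # should include "features"
print(ds[0]["features"])  # first caption
```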
- See the diffusers documentation on LoRA text-to-image training for more information on training the LoRA model.

To generate images with the trained LoRA weights:
```python
from diffusers import StableDiffusionPipeline
import torch
import matplotlib.pyplot as plt

# Load the base model and apply the trained LoRA attention weights.
model_path = "RobertoNeglia/pepe_generator_sd2base_sentiment"
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-base", torch_dtype=torch.float16)
pipe.unet.load_attn_procs(model_path)
pipe.to("cuda")

# Generate and display an image.
prompt = "pepe the frog, sad, crying, digital art, high quality"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]

plt.imshow(image)
plt.axis("off")
plt.show()
```
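Continuing from the snippet above, the pipeline returns a standard PIL image, so it can also be saved to disk instead of plotted; for example, to render one image per sentiment (the sentiment labels here are illustrative):

```python
# Save one image per sentiment instead of plotting (illustrative labels).
for sentiment in ["happiness", "sadness", "anger"]:
    prompt = f"pepe the frog, {sentiment}, digital art, high quality"
    image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
    image.save(f"pepe_{sentiment}.png")
```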
- Download the LLM data to the clone's directory.
- Run these terminal commands in the clone's directory:

```bash
unzstd pol_0616-1119_labeled.tar.zst
tar -xvf pol_0616-1119_labeled.tar
```
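If you want to peek at the data before training, a minimal sketch, assuming the archive extracts to a newline-delimited JSON file (the file name below is a guess; check what `tar` actually produced and adjust):

```python
import json

# Inspect the first record of the extracted dump.
# NOTE: the file name is an assumption; use the actual extracted file.
with open("pol_0616-1119_labeled.ndjson", encoding="utf-8") as f:
    first = json.loads(f.readline())
print(list(first.keys()))
```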
- Run LLM_training.py.
- Open LLM_test.py.
- Insert a subject in the `if __name__ == "__main__"` block; it is interpolated into the prompt "Generate a /pol/ style opening post about: {subject}".

```python
if __name__ == "__main__":
    # Test the generator with a sample topic
    topic = "cats"
    text = gen_OP(topic, max_length=150)
    print(text)
```
- Run LLM_test.py:

```bash
cd scripts
python LLM_test.py
```