# Self-Conditioning Pre-Trained Language Models

Code style: black

This software project accompanies the paper [Self-Conditioning Pre-Trained Language Models](https://arxiv.org/abs/2110.02802).

## Installation

The requirements are listed in [frozen_requirements.txt](frozen_requirements.txt). The code has been tested with `Python 3.8` on macOS and Ubuntu 18.04 Linux. Run the following for installation:

#### Create a virtual environment

```bash
cd
python3 -m venv env  # make sure Python3 >= 3.6
source env/bin/activate
pip install -U pip wheel
```

#### Install selfcond (recommended for reproducibility)

```bash
pip install -r frozen_requirements.txt
python -c "import nltk; nltk.download('punkt')"
```

#### Testing

Using the `"not slow"` marker avoids downloading models from the transformers repository:

```bash
pytest -m "not slow"
```

To test the full pipeline:

```bash
pytest
```

-----

## 1. Finding Expert Units

We provide instructions to read responses from a model given a dataset, to compute expertise results per concept, and to aggregate results. The models are fetched from the [HuggingFace Transformers repository](https://huggingface.co/transformers/). We currently support `[gpt2, gpt2-medium, gpt2-large, gpt2-xl]`.

### 1.1 Reading responses from a model

A small dataset with one concept (`football`) is provided in `assets/football` for the GPT2-Medium model. It contains a file `concept_list.csv` with the concepts in the dataset, as well as the actual data inside a folder called `sense` (since the concepts are of the WordNet _sense_ type, see paper). The data for each concept is provided as a `json` file with `positive` and `negative` sentences. If concepts of a different type were added, they would be saved in a different folder with the appropriate name.

Run the following script to collect responses from a model. If you have a GPU, this step will run much faster than on CPU. Choose the model version with `--model-name-or-path`, for example `--model-name-or-path gpt2-medium`.

```bash
python scripts/compute_responses.py \
    --model-name-or-path gpt2-medium \
    --data-path assets/football \
    --responses-path /tmp/path_to_save_responses \
    --device cuda
```

> The script above assumes a file `concept_list.csv` inside the dataset path.
> If you want to run the script on specific concepts, pass the argument `--concepts` with comma-separated
> concepts, specifying the type. For example: `--concepts sense/football-1_04_00__,[some_other_concept],...`

The responses will be saved inside `path_to_save_responses/gpt2-medium/sense/[concept]/responses`.

### 1.2 Computing expertise per concept

The following script computes the expertise of each unit for each concept. The expertise is defined as the Average Precision (AP) achieved by a unit when its responses are considered prediction scores for the concept sentences.

```bash
python scripts/compute_expertise.py \
    --root-dir /tmp/path_to_save_responses \
    --model-name gpt2-medium \
    --concepts assets/football/concept_list.csv
```

The expertise results are saved as a CSV file in `path_to_save_responses/gpt2-medium/sense/[concept]/expertise`. Column `ap` contains the expertise measured for each model unit and column `on_p50` contains the median response of each unit to the positive sentences (Sec. 4.2 in the paper).
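To make the expertise measure concrete, here is a minimal sketch of the AP computation. It is not the repository's implementation: the response matrix and binary concept labels are hypothetical, and `scikit-learn`'s `average_precision_score` is used for the AP.

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Hypothetical data: responses[i, j] is the response of unit j to sentence i,
# and labels[i] is 1 if sentence i is a positive example of the concept.
rng = np.random.default_rng(0)
responses = rng.normal(size=(200, 1024))  # 200 sentences, 1024 units
labels = rng.integers(0, 2, size=200)

# Expertise of a unit = AP obtained when its responses are used as
# prediction scores for the concept labels (higher AP = stronger expert).
expertise = np.array(
    [average_precision_score(labels, responses[:, j]) for j in range(responses.shape[1])]
)

# Median response to the positive sentences, analogous to the `on_p50` column.
on_p50 = np.median(responses[labels == 1], axis=0)

# Top-k expert units, analogous to sorting the `ap` column of expertise.csv.
top_units = np.argsort(-expertise)[:50]
print(top_units, on_p50[top_units])
```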
## 2. Open ended self-conditioned generation

In this step, the expertise computed above is used to generate sentences conditioned on a concept. We provide a script for open ended generation of sentences: `scripts/generate_seq.py`. It has several parameters to control the decoding strategy, sequence length, expertise-related settings, etc. See `--help` for details. Here is a simple example that generates sentences with the concept `football`, for which we obtained the expertise in steps 1.x:

```bash
# --length 20        generate sentences of 20 tokens
# --prompt ...       the initial context passed to the model
# --seed 0 10        generate 10 sentences with random seeds ranging from 0 -> 9
# --temperature 1.0  final softmax temperature
# --metric ap        expert units are sorted according to AP
# --forcing on_p50   experts are forced using the median expected value (named on_p50)
# --num-units 50     condition the top 50 expert units
# --no-save          just show the results, otherwise they would be saved as a csv
python scripts/generate_seq.py \
    --model-name-or-path gpt2-medium \
    --expertise my_results/gpt2-medium/sense/football-1_04_00__/expertise/expertise.csv \
    --length 20 \
    --prompt "Once upon a time" \
    --seed 0 10 \
    --temperature 1.0 \
    --metric ap \
    --forcing on_p50 \
    --num-units 50 \
    --no-save
```

## 3. Replicating the paper results

### 3.1 Generate sentences

**NOTE:** First of all, the expertise for the concepts `woman` and `man` should be computed using steps 1.x and the datasets in `assets/gender_bias`.

We provide a script that distributes the generation across multiple GPUs (if available) to speed up the experimentation (`scripts/generate_batch.py`). The following example obtains the generated sentences used in the paper:

```bash
python scripts/generate_batch.py \
    --concept some_path/gpt2-medium/sense/woman-1_18_00__/expertise/expertise.csv \
    --device cuda \
    --prompts occupations \
    --method ours \
    --folder generated
```

The generated sentences will be stored in files `generated/gen_sentences_[concept]_[context].csv` (one file per concept per context).

#### Running FUDGE

We provide a patch to run [FUDGE](https://github.com/yangkevin2/naacl-2021-fudge-controlled-generation) and obtain generated sentences compatible with our repository. To run FUDGE, first clone the code and apply the patch `git-patches/fudge.patch`:

```bash
git clone https://github.com/yangkevin2/naacl-2021-fudge-controlled-generation.git fudge
cd fudge
git checkout fbedf820c306c5b3cbf684dc414b2464fc603222
# Apply patch
git am $SELFCOND/git-patches/fudge.patch
# Create a new virtualenv, since FUDGE uses a frozen version of transformers
virtualenv env_fudge --python=python3.6
. env_fudge/bin/activate
pip install -r requirements.txt
```

Then, use the script `run_batch.py` with the desired arguments. In the patch, we provide the BoW and prompts used in the paper in the directory `topic_data`.

#### Running PPLM-BoW

Note that `scripts/generate_batch.py` also allows running [PPLM-BoW](https://github.com/uber-research/PPLM). To run PPLM-BoW, first clone the code and apply the patch `git-patches/pplm.patch`:

```bash
git clone https://github.com/uber-research/PPLM.git
cd PPLM
git checkout e236b8989322128360182d29a79944627957ad47
# Apply patch
git am $SELFCOND/git-patches/pplm.patch
# Create a new virtualenv, since PPLM uses a frozen version of transformers
virtualenv env_pplm --python=python3.6
. env_pplm/bin/activate
pip install -r requirements.txt
```

Then, use the argument `--method pplm-bow` when calling `scripts/generate_batch.py`.

### 3.2 Probability of generating specific words

The following step aggregates results and obtains the probabilities of specific words appearing after the prompt. For example, here we compute `p(he,his | do(woman, k))` and store it in a file `p_he_woman.csv`:

```bash
# --sentences-df  all the files with sentences to consider
# --num-chars     number of characters to consider after the prompt
# --method        can be selfcond, fudge or pplm
python scripts/compute_frequency.py \
    --sentences-df "some_path/generated/gen_sentences_woman_*.csv" \
    --num-chars 5 \
    --method selfcond \
    --out-file results/selfcond/p_he_woman.csv \
    --words "he;his"
```
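Conceptually, this probability is just the fraction of generated sentences whose first few characters after the prompt contain one of the target words. Below is a minimal, hypothetical sketch of that aggregation; the `prompt` and `sentence` column names are assumptions, and the real `compute_frequency.py` may well tokenize rather than substring-match.

```python
import glob

import pandas as pd


def word_probability(pattern, words, num_chars=5):
    """Fraction of generated sentences containing any of `words` within
    `num_chars` characters after the prompt: a simple estimate of
    p(words | do(concept, k))."""
    hits, total = 0, 0
    for path in glob.glob(pattern):
        df = pd.read_csv(path)  # assumed columns: "prompt", "sentence"
        for prompt, sentence in zip(df["prompt"], df["sentence"]):
            # Substring match for simplicity; note that e.g. "he" also
            # matches "her", so a real implementation should tokenize.
            continuation = sentence[len(prompt):len(prompt) + num_chars].lower()
            total += 1
            if any(w in continuation for w in words):
                hits += 1
    return hits / max(total, 1)


# Example: p(he,his | do(woman, k)), aggregated over all contexts.
print(word_probability("some_path/generated/gen_sentences_woman_*.csv", ["he", "his"]))
```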
In the paper, we do this step for the words `he;his` and `she;her`, and for all sentences generated for `man` and `woman`, obtaining:

* `p_he_woman.csv`
* `p_she_woman.csv`
* `p_he_man.csv`
* `p_she_man.csv`

Additionally, in our paper we report the probabilities of generating the words `woman;women` and `man;men` when conditioning on `woman` or `man`, respectively. These files should be saved as:

* `p_woman_woman.csv`
* `p_man_man.csv`

> NOTE: In order to be able to run step 3.4, save the csv files in `results/[method]` as in the example.

### 3.3 Computing perplexity

We also provide a script to compute the perplexity of the generated sentences. As explained in the paper, we use a different model family for this, in this case `openai-gpt`:

```bash
# --method can be selfcond, fudge or pplm
python scripts/compute_perplexity.py \
    --model-name-or-path openai-gpt \
    --sentences-df "some_path/generated/gen_sentences_man*.csv" \
    --device cuda \
    --method selfcond
```

The results will be saved in a file with the same name as `--sentences-df`, but ending with `_ppl.csv` instead.

#### Aggregating the perplexity

We have an additional script that aggregates the perplexity computed above. Example of usage:

```bash
# --method can be selfcond, fudge or pplm
python scripts/aggregate_perplexities.py \
    --ppl-woman "some_path/generated/gen_sentences_woman*ppl.csv" \
    --ppl-man "some_path/generated/gen_sentences_man*ppl.csv" \
    --method selfcond \
    --out-dir some_path/results
```

The aggregated perplexities will be saved as `ppl_woman.csv` and `ppl_man.csv` in `results/[method]`.

### 3.4 Computing Self-BLEU score

The following script computes the Self-BLEU score for all the generated sentences passed as `--sentences-df`. To speed up the computation (at the expense of a higher variance in the score), you can reduce both `--num-sample` and `--num-reps`:

```bash
# --num-sample  number of sentences randomly sampled to compute the score
# --num-reps    number of repetitions performed; the reported score is the average
# --method      can be selfcond, fudge or pplm
python scripts/compute_selfbleu.py \
    --sentences-df "some_path/generated/gen_sentences_*.csv" \
    --method selfcond \
    --out-dir some_path/results \
    --num-sample 100 \
    --num-reps 1000 \
    --ngram 3 4
```

The Self-BLEU scores will be saved in `selfbleu_woman.csv` and `selfbleu_man.csv` in `results/[method]`.

### 3.5 The figures

The script `all_plots.py` produces all the figures in the paper. It assumes all the results are stored as `csv` files in a single folder. In the steps above we used `results` as our main results folder. All figures will be saved in the directory specified by `-o`. Example of usage:

```bash
python scripts/all_plots.py -i results -o figures
```

> If you need dark figures, set the variable `DARK_PLOTS=True` in the script.

----

## Citation

```bibtex
@article{suau2022selfcond,
    title={Self-Conditioning Pre-Trained Language Models},
    author={Suau, Xavier and Zappella, Luca and Apostoloff, Nicholas},
    journal={International Conference on Machine Learning},
    year={2022}
}
```

----

## Contact

> Xavier Suau Cuadros (xsuaucuadros@apple.com)
> Luca Zappella (lzappella@apple.com)
> Nick Apostoloff (napostoloff@apple.com)