# Collaborative RNN

This is a TensorFlow implementation of the Collaborative RNN presented in the paper

> **Collaborative Recurrent Neural Networks for Dynamic Recommender Systems**,
> Young-Jun Ko, Lucas Maystre, Matthias Grossglauser, ACML, 2016.

A PDF of the paper can be found [here](https://infoscience.epfl.ch/record/222477/files/ko101.pdf).

## Requirements

The code is tested with

- Python 2.7.12 and 3.5.1
- NumPy 1.13.3
- TensorFlow 1.4.0
- CUDA 8.0
- cuDNN 6.0
- six 1.11.0

If you are interested in quickly testing out our code, you might want to **check out our [step-by-step guide][1]** for running the collaborative RNN on an AWS EC2 p2.xlarge instance.

## Quickstart

Reproducing the results of the paper should be as easy as following these three steps.

1. Download the datasets.

   - The last.fm dataset is available on [Òscar Celma's page][2]. The relevant file is `userid-timestamp-artid-artname-traid-traname.tsv`.
   - The BrightKite dataset is available at [SNAP][3]. The relevant file is `loc-brightkite_totalCheckins.txt`.

2. Preprocess the data (relabel users and items, remove degenerate cases, split into training and validation sets). This can be done using the script `utils/preprocess.py`. For example, for BrightKite:

       python utils/preprocess.py brightkite path/to/raw_file.txt

   This will create two files named `brightkite-train.txt` and `brightkite-valid.txt`.

3. Run `crnn.py` on the preprocessed data. For example, for BrightKite, you might want to try running

       python -u crnn.py brightkite-{train,valid}.txt --hidden-size=32 \
           --learning-rate=0.0075 --rho=0.997 \
           --chunk-size=64 --batch-size=20 --num-epochs=25

Here is a table that summarizes the settings that gave us the results published in the paper. All the settings can be passed as command-line arguments to `crnn.py`.

| Argument             | BrightKite | last.fm |
| -------------------- | ---------- | ------- |
| `--batch-size`       | 20         | 20      |
| `--chunk-size`       | 64         | 64      |
| `--hidden-size`      | 32         | 128     |
| `--learning-rate`    | 0.0075     | 0.01    |
| `--max-train-chunks` | *(None)*   | 80      |
| `--max-valid-chunks` | *(None)*   | 8       |
| `--num-epochs`       | 25         | 10      |
| `--rho`              | 0.997      | 0.997   |

On a modern server with an Nvidia Titan X (Maxwell generation) GPU, it takes around 40 seconds per epoch on the BrightKite dataset, and around 14 minutes per epoch on the last.fm dataset.

[1]: docs/running-on-aws.md
[2]: http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-1K.html
[3]: https://snap.stanford.edu/data/loc-brightkite.html
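
If you prefer to script the runs above rather than retype the flags, a minimal sketch along the following lines should work. Only `crnn.py`, the `<dataset>-train.txt` / `<dataset>-valid.txt` naming convention, and the flag values in the table come from this repository; the driver script itself and the `lastfm` dataset label used for the last.fm files are assumptions.

    import subprocess

    # Published settings from the table above. The flag names and values come
    # from this README; the "lastfm" label is an assumption about the dataset
    # name passed to utils/preprocess.py for the last.fm data.
    SETTINGS = {
        "brightkite": {
            "--batch-size": 20, "--chunk-size": 64, "--hidden-size": 32,
            "--learning-rate": 0.0075, "--num-epochs": 25, "--rho": 0.997,
        },
        "lastfm": {
            "--batch-size": 20, "--chunk-size": 64, "--hidden-size": 128,
            "--learning-rate": 0.01, "--max-train-chunks": 80,
            "--max-valid-chunks": 8, "--num-epochs": 10, "--rho": 0.997,
        },
    }


    def train(dataset):
        """Run crnn.py on `<dataset>-train.txt` and `<dataset>-valid.txt`."""
        cmd = ["python", "-u", "crnn.py",
               "{}-train.txt".format(dataset),
               "{}-valid.txt".format(dataset)]
        cmd += ["{}={}".format(flag, value)
                for flag, value in sorted(SETTINGS[dataset].items())]
        subprocess.check_call(cmd)


    if __name__ == "__main__":
        train("brightkite")

Calling `train("brightkite")` simply shells out to the same command line shown in step 3 of the quickstart, so any additional flags supported by `crnn.py` can be added to the dictionaries without changing the driver.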