# EasyTransformer
**Repository Path**: kevinpo/EasyTransformer
## Basic Information
- **Project Name**: EasyTransformer
- **Description**: Quick start with strong baselines of BERT and Transformer, without pretraining
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-04-02
- **Last Updated**: 2021-04-02
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# EasyTransformer
A simple implementation of a Transformer baseline and BERT, **extracted from other repositories**.
Both models are unpretrained and randomly initialized.
BERT from https://github.com/ne7ermore/torch-light
Transformer from https://github.com/whr94621/NJUNMT-pytorch
This repo was created to provide a simple, quick start for using **Transformer** and **BERT** as baselines.
Install the package with:
```
pip install EasyTransformer
# latest version: 0.0.4
```
Here is a simple demo (also available in demo.py):
```Python
import torch

from EasyTransformer import bert
from EasyTransformer import transformer

lines = [
    "I love NJU",
    "Good morning",
]

# --- BERT: unpretrained, randomly initialized ---
bert_encoder = bert.BERT()
bert_tokenizer = bert.BertTokenizer(lines, 30000, 512)  # lines, max_wordn, max_len

text, position, segment = [], [], []
for line in lines:
    # encodepro returns the token ids, positions and segment labels of one line
    indexed_tokens, pos, segment_label = bert_tokenizer.encodepro(line)
    text.append(indexed_tokens)
    position.append(pos)
    segment.append(segment_label)

text = torch.tensor(text)
position = torch.tensor(position)
segment = torch.tensor(segment)

out1, out2 = bert_encoder(text, position, segment)
print(out1.shape)
print(out2.shape)

# --- Transformer: unpretrained, randomly initialized ---
transformer_encoder = transformer.TransformerEncoder(30000)  # n_src_vocab
transformer_tokenizer = transformer.TransformerTokenizer(30000, 512, lines)  # max_wordn, max_length, lines

text = torch.tensor([transformer_tokenizer.encode(line) for line in lines])

out1, out2 = transformer_encoder(text)
print(out1.shape)
print(out2.shape)
```
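The per-line encoding step of the demo can be wrapped in a small helper. The sketch below is only an illustration: `encode_batch` and `StubTokenizer` are hypothetical names that are not part of EasyTransformer, and the stub merely imitates the three-value return of `encodepro` so that the snippet runs without the package installed. It also assumes that every encoded line has the same length, which `torch.tensor` needs in the demo.

```python
from typing import List, Sequence, Tuple


def encode_batch(tokenizer, lines: Sequence[str]) -> Tuple[List[list], List[list], List[list]]:
    """Encode every line with tokenizer.encodepro and collect the three outputs."""
    text, position, segment = [], [], []
    for line in lines:
        indexed_tokens, pos, segment_label = tokenizer.encodepro(line)
        text.append(indexed_tokens)
        position.append(pos)
        segment.append(segment_label)
    return text, position, segment


class StubTokenizer:
    """Hypothetical stand-in for bert.BertTokenizer (assumption, not the real class)."""

    def __init__(self, max_len: int = 6):
        self.max_len = max_len

    def encodepro(self, line: str):
        # Fake ids: the length of each word, truncated and zero-padded to max_len.
        ids = [len(word) for word in line.split()][: self.max_len]
        pad = self.max_len - len(ids)
        tokens = ids + [0] * pad
        pos = list(range(1, len(ids) + 1)) + [0] * pad
        seg = [1] * len(ids) + [0] * pad
        return tokens, pos, seg


text, position, segment = encode_batch(StubTokenizer(), ["I love NJU", "Good morning"])
print(text)
print(position)
print(segment)
```

With the real package, passing a `bert.BertTokenizer` instance instead of the stub should work the same way, after which the three lists can be converted with `torch.tensor` as in the demo.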
Here are the parameters you can set to build your own model:
**TransformerTokenizer**
```
def __init__(self, max_wordn, max_length, lines)
```
**TransformerEncoder**
```
def __init__(
    self, n_src_vocab, n_layers=6, n_head=8,
    d_word_vec=512, d_model=512, d_inner_hid=1024, dropout=0.1, dim_per_head=None)
```
**BertTokenizer**
```
def __init__(self, lines, max_wordn, max_len)
```
**BERT**
```
def __init__(self, vacabulary_size=30000, d_model=768, dropout=0.1, max_len=512, n_stack_layers=12, d_ff=3072, n_head=12)
```
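When changing `d_model` and `n_head`, it helps to check that the model width splits evenly across the attention heads. The sketch below is an assumption about typical multi-head attention rather than something confirmed from this package's source: it supposes that `dim_per_head=None` falls back to `d_model // n_head`. The helper name `per_head_dim` is hypothetical.

```python
from typing import Optional


def per_head_dim(d_model: int, n_head: int, dim_per_head: Optional[int] = None) -> int:
    """Return the width of one attention head.

    Assumption: when dim_per_head is None, the width is d_model // n_head,
    which requires d_model to be divisible by n_head.
    """
    if dim_per_head is not None:
        return dim_per_head
    if d_model % n_head != 0:
        raise ValueError(f"d_model={d_model} is not divisible by n_head={n_head}")
    return d_model // n_head


print(per_head_dim(512, 8))   # TransformerEncoder defaults listed above
print(per_head_dim(768, 12))  # BERT defaults listed above
```

Both default configurations give 64 dimensions per head. A combination such as `d_model=512, n_head=7` would raise an error in this sketch.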