- """
- Greedy Search
- """
- import tensorflow as tf
- from transformers import TFGPT2LMHeadModel, GPT2Tokenizer
- tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
- # add the EOS token as PAD token to avoid warnings
- model = TFGPT2LMHeadModel.from_pretrained("gpt2", pad_token_id=tokenizer.eos_token_id)
- # Encode the prompt into input IDs the model can consume
- input_ids = tokenizer.encode('the japanese is so ', return_tensors='tf')
- # Greedy search: cap the generated text at a maximum length of 50 tokens
- greedy_output = model.generate(input_ids, max_length=50, early_stopping=True)
- """
- Output:
- the japanese is so - that it's not even -
- that it's -that it's - that it's - that it's - that it's
- """
- # Beam search with 5 beams
- beam_output = model.generate(input_ids, max_length=50, early_stopping=True, num_beams=5)
- """
- the japanese is so iced up that I don't even know how to pronounce it.
- I'm not sure how to pronounce it.
- I'm not sure how to pronounce it.
- """
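Because the results above contain repeated text, we add an n-gram penalty, which ensures that no n-gram appears twice in the generated sequence. The drawback is that in cases where repetition is actually needed, the n-gram can still appear only once.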
- ngram_beam_output = model.generate(input_ids, max_length=50, early_stopping=True, num_beams=5, no_repeat_ngram_size=2)  # no_repeat_ngram_size is the n of the n-gram that may not repeat
- """
- Output:
- the japanese is so iced up that I don't even know how to pronounce it.
- I'm not sure if it's because I'm lazy, or if I just want to be able to say it in Japanese, but I
- """
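The n-gram penalty described above can be illustrated with a minimal, library-free sketch. It is not the transformers implementation; the helper name `banned_next_tokens` is hypothetical and only shows the idea: before each step, find every token that would complete an n-gram already present in the sequence and exclude it from the candidates.

```python
def banned_next_tokens(tokens, n):
    """Return the tokens that would complete an n-gram already seen in `tokens`.

    Hypothetical helper for illustration only, not a transformers API.
    """
    if n <= 0 or len(tokens) < n:
        return set()  # no complete n-gram exists yet, so nothing is banned
    # The last n-1 generated tokens form the prefix of the next n-gram.
    prefix = tuple(tokens[len(tokens) - (n - 1):])
    banned = set()
    for i in range(len(tokens) - n + 1):
        ngram = tokens[i:i + n]
        if tuple(ngram[:-1]) == prefix:
            banned.add(ngram[-1])
    return banned


# With n=2, "I am" was already generated, so "am" may not follow "I" again.
print(banned_next_tokens(["I", "am", "not", "I"], 2))  # → {'am'}
```

A decoder using this idea would set the score of every banned token to negative infinity before choosing the next token, which is why a bigram that legitimately needs to recur cannot appear a second time.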
- # Return the 5 highest-scoring beams (num_return_sequences must not exceed num_beams)
- ngram_beam_topk_output = model.generate(input_ids, max_length=50, early_stopping=True, num_beams=5, no_repeat_ngram_size=2, num_return_sequences=5)
- """
- 0: the japanese is so iced up that I don't even know how to pronounce it.
- I'm not sure if it's because I'm lazy, or if I just want to be able to say it in Japanese, but I
- 1: the japanese is so iced up that I don't even know how to pronounce it.
- I'm not sure if it's because I'm lazy, or if I just want to be able to say it in Japanese, but it
- 2: the japanese is so iced up that I don't even know how to pronounce it.
- I'm not sure if it's because I'm lazy, or if I just want to be able to read Japanese, but I think it
- 3: the japanese is so iced up that I don't even know how to pronounce it.
- I'm not sure if it's because I'm lazy, or if I just want to be able to say it in Japanese. But I
- 4: the japanese is so iced up that I don't even know how to pronounce it.
- I'm not sure if it's because I'm lazy, or if I just want to be able to read Japanese, but I think I
- """
- tf.random.set_seed(0)
- # print("Output:\n" + 100 * '-')
- sample_output = model.generate(input_ids, do_sample=True, max_length=200, top_k=50)
- print(tokenizer.decode(sample_output[0], skip_special_tokens=True))
- """
- Output:
- the japanese is so icky, that even the "Japan's food is the finest" is not really an excuse.
- My point is what it is that does the Japanese taste better than others in this country. The reason is that
- """
- sample_output = model.generate(
- input_ids,
- do_sample=True,
- max_length=50,
- top_k=0,
- temperature=0.9
- )
- print(tokenizer.decode(sample_output[0], skip_special_tokens=True))
- """
- Output:
- the japanese is so familiar with the Japanese language that it's hard to imagine it being used in a Japanese context.
- The Japanese language is a very complex language, and it's hard to imagine a Japanese person using it in
- """
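The `temperature=0.9` argument in the call above rescales the logits before sampling. As a rough sketch of that effect, assuming the usual softmax-with-temperature formulation (the function name and the toy logits below are made up for illustration and are not part of transformers):

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
plain = softmax_with_temperature(toy_logits, 1.0)
sharper = softmax_with_temperature(toy_logits, 0.5)

# A temperature below 1 moves probability mass toward the most likely token.
print(plain[0] < sharper[0])  # → True
```

Lower temperatures make sampling behave more like greedy search, while a temperature of 1.0 leaves the distribution unchanged.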
- sample_output = model.generate(
- input_ids,
- do_sample=True,
- max_length=50,
- top_k=50
- )
- """
- the japanese is so iaa.
- I mean, that's true that people are very hard at finding words in japanese. In fact it might
- be more accurate to say that Japanese means "Japanese", which is what you
- """
- # deactivate top_k sampling and sample only from 92% most likely words
- sample_output = model.generate(
- input_ids,
- do_sample=True,
- max_length=50,
- top_p=0.92,
- top_k=0
- )
- """
- the japanese is so!!!! Cuz of that, you forgot to put caps properly!!!!
- I'll make this shit up for those who can't usually talk with one who can PLEASE READ THE INPICIOUS and SATURN INVINC
- """
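The `top_k` and `top_p` arguments used in the sampling calls above both shrink the set of candidate tokens before one is drawn. The following is a minimal sketch on a made-up probability table, not the transformers implementation; the function names and the toy distribution are assumptions for illustration.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    kept = sorted(probs, key=probs.get, reverse=True)[:k]
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches p, then renormalize (nucleus sampling)."""
    kept, cumulative = [], 0.0
    for token in sorted(probs, key=probs.get, reverse=True):
        kept.append(token)
        cumulative += probs[token]
        if cumulative >= p:
            break
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}


# Made-up next-token distribution.
toy_probs = {"nice": 0.5, "dog": 0.3, "car": 0.15, "zebra": 0.05}

print(sorted(top_k_filter(toy_probs, 2)))     # → ['dog', 'nice']
print(sorted(top_p_filter(toy_probs, 0.92)))  # → ['car', 'dog', 'nice']
```

Top-k keeps a fixed number of candidates regardless of how the probability mass is spread, whereas top-p lets the candidate set grow or shrink with the shape of the distribution. In the calls above, `top_k=0` switches top-k filtering off so that only the `top_p` cutoff applies.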
- # set top_k = 50 and set top_p = 0.95 and num_return_sequences = 3
- sample_outputs = model.generate(
- input_ids,
- do_sample=True,
- max_length=50,
- top_k=50,
- top_p=0.95,
- num_return_sequences=3
- )
- """
- the japanese is so icky in mine. it's a lot like someone's salivating over peas and chocolate.
- rxtfffffff.......ooh they need Japanese drinking?? Just a thought....... Join us and talk to us about all kinds
- """