
The output is the same as the input. #81

@yz-qiang

Description

I have a strange problem. I want to run batched prediction over my data, so I first use tokenizer.encode() to encode it, with padding to get a uniform sequence length. The specific code is shown below:

import torch
from transformers import StoppingCriteriaList

# StopOnTokens is the stopping-criteria class from the repository README.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model.resize_token_embeddings(len(tokenizer))
code = '''Judge whether there is a defect in ["int ff_get_wav_header"]. Respond with 'YES' or 'NO' only.'''
system_prompt = """# StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""
prompt = f"<|SYSTEM|>{system_prompt}<|USER|>{code}<|ASSISTANT|>"
source_ids = tokenizer.encode(prompt, max_length=1500, padding='max_length', return_tensors='pt').to("cuda")
source_mask = source_ids.ne(tokenizer.pad_token_id).long()
type_ids = torch.zeros(source_ids.shape, dtype=torch.long).to("cuda")  # note: never passed to generate()
print(source_mask.shape, '---', source_ids.shape)
preds = model.generate(
        source_ids, attention_mask = source_mask,
        max_new_tokens=512,
        temperature=0.2,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        stopping_criteria=StoppingCriteriaList([StopOnTokens()])
      )
top_preds = list(preds.cpu().numpy())
pred_nls = [tokenizer.decode(ids, skip_special_tokens=True, clean_up_tokenization_spaces=False) for ids in top_preds]
print(pred_nls)

However, the output is the same as the prompt, as follows:

['# StableLM Tuned (Alpha version)\n- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.\n- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.\n- StableLM will refuse to participate in anything that could harm a human.\nJudge whether there is a defect in ["int ff_get_wav_header"]. Respond with \'YES\' or \'NO\' only.']
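To confirm that generate() really produced nothing new, I compared the prompt length with the output length; generate() returns the prompt followed by any continuation, so equal lengths mean no new tokens at all. A minimal check, using the names from the snippet above:

# Minimal check: generate() returns prompt + continuation, so if the
# sequence length did not grow, nothing was generated.
print("prompt tokens:", source_ids.shape[1], "output tokens:", preds.shape[1])

# Decode only the part after the prompt; for me this comes out empty.
new_tokens = preds[0, source_ids.shape[1]:]
print(repr(tokenizer.decode(new_tokens, skip_special_tokens=True)))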

If I skip the manual tokenizer.encode() call and instead use the tokenizer directly to predict, I get normal output. That code looks like this:

code = '''Judge whether there is a defect in ["int ff_get_wav_header"]. Respond with 'YES' or 'NO' only.'''
system_prompt = """# StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""
prompt = f"<|SYSTEM|>{system_prompt}<|USER|>{code}<|ASSISTANT|>"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
tokens = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.2,
        do_sample=True,
        stopping_criteria=StoppingCriteriaList([StopOnTokens()])
      )
print(tokenizer.decode(tokens[0], skip_special_tokens=True))

The output is as follows:

# StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
Judge whether there is a defect in ["int ff_get_wav_header"]. Respond with 'YES' or 'NO' only.**Yes**.
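Note that even in this working case the prompt is echoed, because generate() returns the input tokens followed by the continuation. If I only want the model's answer, I can slice the prompt off before decoding; a small sketch using the names from the second snippet:

# Decode only the newly generated tokens, dropping the echoed prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(tokens[0][prompt_len:], skip_special_tokens=True))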

Why does this happen? Is there something important I'm missing?
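For reference, my end goal is batched prediction. My understanding (please correct me if wrong) is that the high-level tokenizer call handles padding and the attention mask together, and that decoder-only models should be padded on the left for generation. A sketch of what I had in mind, where prompts is a hypothetical list of formatted prompt strings:

# Sketch of batched generation (assumption: left padding is needed for a
# decoder-only model so each continuation starts right after its prompt).
tokenizer.padding_side = "left"
batch = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")
preds = model.generate(
    **batch,  # passes both input_ids and the matching attention_mask
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    stopping_criteria=StoppingCriteriaList([StopOnTokens()]),
)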
