ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty #615

### Discussed in https://github.com/abetlen/llama-cpp-python/discussions/614

<div type='discussions-op-text'>

<sup>Originally posted by **talhalatifkhan** August 16, 2023</sup>
I am trying to make sure that my output follows a JSON format every time. I came across jsonformer, and from there grammar-based sampling. I used [json-schema-to-grammar.py](https://github.com/ggerganov/llama.cpp/blob/master/examples/json-schema-to-grammar.py) to convert my JSON schema into a grammar.

I want to know whether grammar-based sampling is meant for this purpose and, if so, how to use it.

JSON schema
```
json_schema = {
    "type": "object",
    "properties": {
        "Stage": {
            "type": "string",
            "enum": ["first", "second"]
        },
        "Task Finished": {"type": "boolean"},
        "Statement": {"type": "string"},
        "Assistant": {"type": "string"}
    }
}
```
Llama grammar
```
space ::= " "?
string ::=  "\"" (
        [^"\\] |
        "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* "\"" space 
Stage ::= "\"first\"" | "\"second\""
boolean ::= ("true" | "false") space
root ::= "{" space "\"Assistant\"" space ":" space string "," space "\"Stage\"" space ":" space Stage "," space "\"Statement\"" space ":" space string "," space "\"Task Finished\"" space ":" space boolean "}" space
```
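
To make concrete what this grammar constrains, the sketch below transliterates its rules into a Python regular expression and checks two sample objects against it. This is only an illustration: llama.cpp does not evaluate grammars as regexes, and the names `SPACE`, `STRING`, `STAGE`, `BOOLEAN`, `ROOT` and the sample values are made up for the example. The `root` rule requires all four keys in one fixed order, so an object with the same keys in a different order is rejected.

```python
import json
import re

# Each constant mirrors one rule of the grammar above (illustrative names).
SPACE = r" ?"                                   # space ::= " "?
STRING = r'"(?:[^"\\]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))*"' + SPACE
STAGE = r'(?:"first"|"second")'                 # no trailing space in the rule
BOOLEAN = r"(?:true|false)" + SPACE

# root: all four keys, in exactly this order.
ROOT = re.compile(
    r"\{" + SPACE
    + r'"Assistant"' + SPACE + ":" + SPACE + STRING
    + "," + SPACE + r'"Stage"' + SPACE + ":" + SPACE + STAGE
    + "," + SPACE + r'"Statement"' + SPACE + ":" + SPACE + STRING
    + "," + SPACE + r'"Task Finished"' + SPACE + ":" + SPACE + BOOLEAN
    + r"\}" + SPACE
)

# Hypothetical sample values; json.dumps keeps dict insertion order.
in_order = json.dumps({
    "Assistant": "Did you mean option A?",
    "Stage": "first",
    "Statement": "option A",
    "Task Finished": False,
})
out_of_order = json.dumps({
    "Stage": "first",
    "Assistant": "Did you mean option A?",
    "Statement": "option A",
    "Task Finished": False,
})

print(bool(ROOT.fullmatch(in_order)))
print(bool(ROOT.fullmatch(out_of_order)))
```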


Here is my code

```
from llama_cpp import Llama, LlamaGrammar

fs_template = """
You are a precise AI comparer. Your task is to match the user's intent to the statements in the context and confirm if the identified intent is correct.
Your responses should strictly follow the format below:
    Stage: [print 'first']
    User Intent: [insert user intent statement here]
    Task Finished: [insert boolean value based on whether user intent is confirmed]
    Assistant: [insert Assistant response here]


Adhere to the following instructions to complete the task:
1. Start by trying to match the user's question to the statements in the context.
2. If you identify the matching statement to the user's question then confirm it from the user.
3. If the user's intent is unclear or doesn't match the context, ask follow-up questions by providing the options in the context.
4. Once you have confirmed the user intent, set "Task Finished: True" and proceed with your response.
5. You will fail your task if the output generated does not follow the format mentioned above.

Context: (only knowledge base you have)
------------
sample context
-----------
"""

schema = '''
space ::= " "?
string ::=  "\"" (
        [^"\\] |
        "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* "\"" space 
Stage ::= "\"first\"" | "\"second\""
boolean ::= ("true" | "false") space
root ::= "{" space "\"Assistant\"" space ":" space string "," space "\"Stage\"" space ":" space Stage "," space "\"Statement\"" space ":" space string "," space "\"Task Finished\"" space ":" space boolean "}" space
'''


def get_prompt(question: str, chat_history: list,
               system_prompt: str) -> str:
    texts = [f'[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n']
    for user_input, response in chat_history:
        texts.append(f'{user_input.strip()} [/INST] {response.strip()} </s><s> [INST] ')
    texts.append(f'{question.strip()} [/INST]')
    return ''.join(texts)


history = []
prompt = get_prompt("user query", history, fs_template)

grammar = LlamaGrammar.from_string(grammar=schema, verbose=True)
print(grammar)
client = Llama(
    model_path="model/llama-2-13b-chat.ggmlv3.q8_0.bin",
    n_ctx=4098,
    n_threads=16,
    last_n_tokens_size=70,
)

answer = client(
    prompt,
    grammar=grammar,
    stream=False,
    temperature=0.0,
    top_p=0.95,
    top_k=50,
    repeat_penalty=1.3,
    max_tokens=4000,
)
print(answer)
```
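
Once the grammar loads, the constrained output still arrives as text and has to be decoded. The sketch below assumes the non-streaming call returns an OpenAI-style completion dict with the generated text at `choices[0]["text"]`; `parse_completion` and `fake_completion` are hypothetical names, and the fake dict stands in for a real model response.

```python
import json


def parse_completion(completion: dict) -> dict:
    """Decode the generated text of a completion dict as JSON.

    Assumes the text lives at completion["choices"][0]["text"].
    Raises json.JSONDecodeError if the text is not valid JSON, for
    example when generation stopped at max_tokens mid-object.
    """
    text = completion["choices"][0]["text"]
    return json.loads(text.strip())


# Stand-in for the value returned by the model call (made-up content).
fake_completion = {
    "choices": [
        {
            "text": ' {"Assistant": "Did you mean option A?", '
                    '"Stage": "first", "Statement": "option A", '
                    '"Task Finished": false}'
        }
    ]
}

result = parse_completion(fake_completion)
print(result["Stage"], result["Task Finished"])
```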

This is the error I am getting:
```
parse: error parsing grammar: expecting newline or end at \] |
        "\" (["\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* """ space 
Stage ::= ""first"" | ""second""
boolean ::= ("true" | "false") space
root ::= "{" space ""Assistant"" space ":" space string "," space ""Stage"" space ":" space Stage "," space ""Statement"" space ":" space string "," space ""Task Finished"" space ":" space boolean "}" space

Traceback (most recent call last):
  File "/home/talha/CloudWhisper/jformer.py", line 49, in <module>
    grammar = LlamaGrammar.from_string(grammar=schema,verbose=True)
  File "/home/talha/.local/lib/python3.10/site-packages/llama_cpp/llama_grammar.py", line 66, in from_string
    raise ValueError(
ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty
```
</div>
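
The parser output above shows the grammar with its backslashes already gone (`""first""` instead of `"\"first\""`), which suggests a likely cause: `schema` is an ordinary triple-quoted string, so Python resolves `\"` and `\\` before `LlamaGrammar.from_string` ever sees the text. A minimal sketch of the difference, using the `Stage` rule and the character class from the grammar in the post (the variable names are illustrative):

```python
# Ordinary literal: Python processes the escapes, so \" collapses to "
# before any grammar parser runs.
cooked = '''
Stage ::= "\"first\"" | "\"second\""
'''

# Raw literal: backslashes are kept exactly as written.
raw = r'''
Stage ::= "\"first\"" | "\"second\""
'''

print(cooked)  # the rule as the grammar parser would receive it
print(raw)     # the rule as json-schema-to-grammar.py emitted it

# The same happens to the character class in the `string` rule:
# the ordinary literal keeps one backslash, the raw literal keeps both.
cooked_class = '[^"\\]'
raw_class = r'[^"\\]'
print(len(cooked_class), len(raw_class))
```

In the ordinary literal the class becomes `[^"\]`, which lines up with the point where the parser error starts (`\] |`). If this is indeed the cause, prefixing the literal with `r` (`schema = r'''...'''`) or reading the grammar text from a file should hand the parser the same text that json-schema-to-grammar.py produced.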
