3.7. Frequently Asked Questions
Q1: How do I modify the wake-up words?
A: Edit the word list in tk_audio_process.py:
self.wake_up_words = ["TienKung", "sky", "space"]  # Change to your own wake-up words
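The wake-word check itself can be sketched as a case-insensitive substring match against the ASR transcript. This is an illustrative sketch, not the actual tk_audio_process.py logic; it assumes the ASR service returns a plain text string.

```python
# Hypothetical wake-word check; the word list mirrors
# self.wake_up_words above.
WAKE_UP_WORDS = ["TienKung", "sky", "space"]

def contains_wake_word(transcript: str, wake_words=WAKE_UP_WORDS) -> bool:
    """Return True if any configured wake-up word appears in the transcript."""
    text = transcript.lower()
    return any(word.lower() in text for word in wake_words)
```

Substring matching keeps the check tolerant of surrounding words ("Hey sky, ..."), at the cost of occasional false triggers inside longer words.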
Q2: How can I increase speech response speed?
A: Use a smaller or faster LLM:
# In llm_client.py
self.model = "qwen2.5:0.5b"  # Smaller models respond faster, but verify that generation quality is still acceptable
Q3: How do I adjust the TTS speech rate?
A: Adjust length_scale in piper_provider.py:
self.piper_syn_config = SynthesisConfig(
    length_scale=0.8,  # < 1.0 is faster, > 1.0 is slower
    ...
)
Q4: Does it support multiple languages?
A: Yes, with two changes. First, download a Piper voice model for the target language:
# Language-specific voice models
https://huggingface.co/rhasspy/piper-voices/tree/main
Then update the model path in piper_provider.py.
Second, the FunASR service on the x86 machine must also be redeployed with models that support the target language; see https://github.com/modelscope/FunASR
Q5: How do I implement multi-turn conversation memory?
A: Adjust the history length in llm_client.py:
self.history = deque(maxlen=10)  # Keep the latest 5 turns (10 messages)
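The deque-based history works as a rolling window: once maxlen is reached, appending a new message silently drops the oldest one. The class below is an illustrative sketch of that pattern, not the actual llm_client.py implementation; the names ChatHistory, add, and messages are hypothetical.

```python
from collections import deque

class ChatHistory:
    """Rolling conversation memory: keeps only the newest messages."""

    def __init__(self, max_messages: int = 10):
        # maxlen=10 keeps the 5 most recent user/assistant pairs.
        self.history = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

    def messages(self) -> list:
        # Oldest-first message list, ready to pass to the model.
        return list(self.history)

# Demonstration with a small window of 4 messages (2 turns):
history = ChatHistory(max_messages=4)
for i in range(3):
    history.add("user", f"question {i}")
    history.add("assistant", f"answer {i}")
# Turn 0 has been evicted; only questions/answers 1 and 2 remain.
```

Raising maxlen lengthens memory but also grows the prompt sent to the LLM on every turn, which slows responses on small models.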
Q6: Can other Ollama models be used offline?
A: Yes, but the models must be pulled in advance:
ollama pull qwen2.5:1.5b
Then change the model name passed to llm_client.
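How llm_client reaches Ollama is not shown here; one common route is Ollama's local HTTP API, which listens on port 11434 by default. The sketch below assumes that route and uses only the standard library; generate() will only work with the Ollama server running and the model already pulled.

```python
import json
import urllib.request

# Default local endpoint of the Ollama HTTP API.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Request body for a single, non-streaming generation."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return its reply."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Swapping models is then just a matter of passing a different name, e.g. generate("qwen2.5:1.5b", "Hello"), provided that model was pulled beforehand.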
Q7: What if calling funasr_client for speech recognition returns an error?
A: On the x86 machine, check whether the FunASR service Docker container is running:
docker ps | grep asr
Q8: How do I handle an unstable network?
A: Add a retry mechanism around the ASR call (requires import time):
for attempt in range(3):
    try:
        result = self.asr_service.to_text(audio_bytes)
        return result
    except Exception:
        if attempt < 2:
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, then 2s
        else:
            raise
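If several calls need this protection (ASR, LLM, TTS), the retry loop above can be factored into a reusable decorator. This is a sketch; the names with_retry, retries, and base_delay are illustrative, and the defaults mirror the 3-attempt, power-of-2 backoff shown above.

```python
import functools
import time

def with_retry(retries: int = 3, base_delay: float = 2.0):
    """Retry a function with exponential backoff on any exception."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(retries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == retries - 1:
                        raise  # Out of attempts: propagate the last error
                    time.sleep(base_delay ** attempt)  # 1s, 2s, 4s for base 2
        return wrapper
    return decorator

# Demonstration with a function that fails twice, then succeeds.
@with_retry(retries=3, base_delay=0.01)  # tiny delay to keep the demo fast
def flaky():
    flaky.calls += 1
    if flaky.calls < 3:
        raise ConnectionError("transient failure")
    return "ok"

flaky.calls = 0
```

In the project code this could wrap the method containing self.asr_service.to_text(audio_bytes), keeping the network-error handling in one place.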