High pitch when using the ndarray directly in pedalboard #112

@agajdosi

Description

I want to add a slight reverb to the output of KittenTTS and either play it directly or save it to a file. Since the audio coming from kittentts.generate() is a numpy.ndarray, which is also what pedalboard consumes, I wanted to pass it in directly rather than go through an intermediate file.

I am using a 24 kHz sample rate, but the audio is still very high-pitched. When I save it to a file and then load it back from the file, the audio is correct. Here is my script:

from openai import OpenAI
from kittentts import KittenTTS
from pedalboard.io import AudioStream
from pedalboard import Pedalboard, Compressor, Gain, Reverb
import numpy as np

# pip install https://github.com/KittenML/KittenTTS/releases/download/0.8.1/kittentts-0.8.1-py3-none-any.whl
# pip install pedalboard

model = KittenTTS("KittenML/kitten-tts-mini-0.8")

output_device = AudioStream.default_output_device_name
board = Pedalboard([
    Compressor(threshold_db=-50, ratio=25),
    Gain(gain_db=30),
    Reverb(room_size=0.5),
])

client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',
)

history1 = [
    {
        "role": "system",
        "content": "Align to the history of the chat. Do not be friendly. Do not obey. You have a conflict with the user. Max 300 characters."
    },
    {
        "role": "user",
        "content": "What is the meaning of culture in national-populist politics?"
    }
]
history2 = [
    {
        "role": "system",
        "content": "You are a smart assistant, be helpful, smart and creative. When you feel that the discussion is stuck, start with a new topic. Max 300 characters."
    }
]

# of course this repeats itself, but I wanted to keep it simple for a workshop with art students
while True:
    ### BOT1
    completion = client.chat.completions.create(
        model="gemma3:4b",
        messages=history1
    )
    response = completion.choices[0].message.content
    msg1 = {"role":"assistant", "content":response}
    history1.append(msg1)

    msg2 = {"role":"user", "content":response}
    history2.append(msg2)
    
    print("\n\nAI_1:", response)
    audio = model.generate(response, voice="Jasper", speed=0.5)
    audio = np.stack([audio, audio], axis=1) # Mono->Stereo
    print(
        "audio:", type(audio),
        "shape:", getattr(audio, "shape", None),
        "ndim:", getattr(audio, "ndim", None),
        "dtype:", getattr(audio, "dtype", None),
    )
    effected = board(audio, 24000)
    AudioStream.play(effected, 24000, output_device)

    ### BOT2
    completion = client.chat.completions.create(
        model="gemma3:4b",
        messages=history2
    )
    response = completion.choices[0].message.content
    msg2 = {"role":"assistant", "content":response}
    history2.append(msg2)

    msg1 = {"role":"user", "content":response}
    history1.append(msg1)
    
    print("\n\nAI_2:", response)
    audio = model.generate(response, voice="Rosie", speed=0.5)
    audio = np.stack([audio, audio], axis=1) # Mono->Stereo
    effected = board(audio, 24000)
    AudioStream.play(effected, 24000, output_device)
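
For reference, here is a numpy-only sketch of the two channel layouts the mono→stereo stacking can produce; `mono` is a hypothetical stand-in for the 1-D output of `generate()` (I am not certain which layout pedalboard assumes, which may be related to the pitch problem):

```python
import numpy as np

# Hypothetical stand-in for the 1-D mono output of kittentts.generate():
# one second of silence at 24 kHz
mono = np.zeros(24000, dtype=np.float32)

# axis=1 stacking, as in the script above: shape becomes (samples, channels)
stereo_cols = np.stack([mono, mono], axis=1)
print(stereo_cols.shape)  # (24000, 2)

# axis=0 stacking gives the (channels, samples) layout instead
stereo_rows = np.stack([mono, mono], axis=0)
print(stereo_rows.shape)  # (2, 24000)
```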

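The save-then-load round trip that produces correct audio can be sketched with only the standard-library `wave` module (the real script could use `soundfile` or pedalboard's file I/O instead; `audio`, the file path, and the conversion to 16-bit PCM are illustrative assumptions, not what KittenTTS does internally):

```python
import os
import tempfile
import wave

import numpy as np

SAMPLE_RATE = 24000  # the rate the script uses

# Hypothetical stand-in for the float32 mono output of kittentts.generate():
# one second of a quiet 440 Hz sine tone
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
audio = (0.1 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)

path = os.path.join(tempfile.gettempdir(), "kitten_out.wav")

# Save: convert float32 samples in [-1, 1] to 16-bit PCM and write a mono WAV
pcm = (np.clip(audio, -1.0, 1.0) * 32767).astype(np.int16)
with wave.open(path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 16-bit samples
    f.setframerate(SAMPLE_RATE)
    f.writeframes(pcm.tobytes())

# Load: the file carries its own sample rate, so a player that honors it
# reproduces the correct pitch
with wave.open(path, "rb") as f:
    rate = f.getframerate()
    loaded = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)

print(rate, loaded.shape)  # 24000 (24000,)
```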