Chatbot with Voice Clone

# The Heart Space — Chatbot with Cloned Voice: Full Documentation

## Overview

"The Heart Space" is an AI-powered trauma recovery chatbot embedded in the Trauma Navigator platform. It combines:

- **OpenAI GPT-4o** for empathetic, trauma-informed text responses

- **HeyGen Voice Cloning** (primary) to speak responses in a cloned human voice (Felicia)

- **OpenAI TTS** (fallback) using the `nova` voice model

- **RAG (Retrieval-Augmented Generation)** to ground responses in the toolkit's exercise library

- **Persistent user accounts** with full conversation history

---

## API Dependencies

| Service | Purpose | API Key Environment Variable |
|---|---|---|
| OpenAI (GPT-4o) | Chat completion (AI responses) | `OPENAI_API_KEY` |
| OpenAI (text-embedding-3-small) | RAG vector embeddings | `OPENAI_API_KEY` |
| OpenAI (tts-1-hd, nova) | Fallback TTS voice | `OPENAI_API_KEY` |
| HeyGen v3 Voices API | Primary cloned voice synthesis | `HEYGEN_API_KEY` |
| ElevenLabs (inactive) | Legacy secondary TTS (no longer primary) | `ELEVENLABS_API_KEY` |

---

## Environment Variables

All keys must be set in the server environment. They are configured in `.env` and also in `ecosystem.config.cjs` for the PM2 process manager.

```
OPENAI_API_KEY=sk-...
HEYGEN_API_KEY=sk_V2_hgu_kHPSnIq8Fxq_...
ELEVENLABS_API_KEY=...            # legacy, kept but not primary
ADMIN_PASSWORD=...                # protects admin-only routes
SITE_URL=https://your-domain.com  # used during RAG re-indexing
```

---

## Key Source Files

| File | Role |
|---|---|
| `server/openai.ts` | All AI/TTS generation functions |
| `server/routes.ts` | All chatbot API routes + speech cache |
| `server/embeddings.ts` | RAG context retrieval + content indexing |
| `server/auth.ts` | Password hashing/verification (scrypt) |
| `server/storage.ts` | Database access layer (Drizzle ORM) |
| `shared/schema.ts` | PostgreSQL schema definitions |
| `client/src/components/ChatBot.tsx` | React UI component (floating widget) |
| `client/src/hooks/use-tts.ts` | Reusable TTS hook (calls `/api/tts`) |

---

## Database Schema

Defined in `shared/schema.ts` using Drizzle ORM against PostgreSQL.

### `visitors` table

```
id            serial PRIMARY KEY
name          text NOT NULL
email         text NOT NULL
passwordHash  text -- scrypt hash
lastState     text
createdAt     timestamp
```

### `conversations` table

```
id         serial PRIMARY KEY
visitorId  integer → visitors.id
createdAt  timestamp
```

### `messages` table

```
id              serial PRIMARY KEY
conversationId  integer → conversations.id
role            text -- 'user' | 'assistant'
content         text
createdAt       timestamp
```

### `resources` table (exercise library)

```
id           serial PRIMARY KEY
title        text
description  text
content      text
type         text -- 'breathing' | 'somatic' | 'grounding' | 'creative'
category     text -- 'sympathetic' | 'parasympathetic' | 'creative'
imageUrl     text
```

### `resource_embeddings` table (RAG)

```
id           serial PRIMARY KEY
resourceId   integer → resources.id
pageId       integer → pages.id
embedding    jsonb -- float[] vector from text-embedding-3-small
contentType  text -- 'resource' | 'page'
url          text -- direct link to the content
```

### `knowledge_documents` table (admin knowledge base)

```
id           serial PRIMARY KEY
title        text
sourceType   text -- 'upload' | 'url' | 'manual'
sourceUrl    text
content      text
contentType  text -- 'pdf' | 'txt' | 'url' | 'manual'
embedding    jsonb
createdAt    timestamp
updatedAt    timestamp
```

---

## Step-by-Step: How the Voice Clone Was Created

### Step 1 — Voice Recording (External to Codebase)

1. Record clean audio samples of the target voice (Felicia) — typically 1–5 minutes of clear speech with no background noise.

2. Log in to [HeyGen Studio](https://app.heygen.com) → **Voices** → **Voice Clone**.

3. Upload the audio samples and create an **Instant Voice Clone** or **Professional Voice Clone**.

4. Once processed, HeyGen assigns a **Voice ID** (UUID). This is the cloned voice's identifier.

### Step 2 — Store the Voice ID

The Voice ID is hardcoded as a constant in `server/openai.ts`:

```ts

// server/openai.ts (line 97)

const FELICIA_VOICE_ID = "b5d52e83e8fe4a34a24bc5cffd2ada3a";

```

This ID is also referenced as `FELICIA_VOICE_CLONE` on line 147 (legacy ElevenLabs constant with same value, kept for reference).

### Step 3 — Store the API Key

Add the HeyGen API key to the server environment:

```

HEYGEN_API_KEY=sk_V2_hgu_kHPSnIq8Fxq_...

```

In `ecosystem.config.cjs`, this is passed via PM2's `env` block so the Node.js process receives it at runtime.

---

## Step-by-Step: Full Chat + Voice Workflow

### Phase 1 — User Authentication

1. User clicks the floating heart button → `ChatBot.tsx` opens.

2. User registers (`POST /api/chat/register`) or logs in (`POST /api/chat/login`).

   - **Register:** `name`, `email`, `password` → password hashed with `scrypt` → `visitors` row + `conversations` row created → returns `{ visitor, conversation }`.

   - **Login:** email + password → verified against stored `scrypt` hash → returns `{ visitor, conversations[] }`.

3. Session is persisted in browser `sessionStorage` as `chat_visitor` and `chat_conversation` keys.
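
The hashing and verification steps above could look roughly like the following sketch, which uses Node's built-in `crypto` module. The function names, the `salt:hash` storage format, and the 64-byte key length are assumptions for illustration; the actual contents of `server/auth.ts` are not shown in this document.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Assumed derived-key length in bytes; the real value in server/auth.ts is unknown.
const KEY_LENGTH = 64;

// Returns "<saltHex>:<hashHex>" so the salt is stored alongside the hash.
function hashPassword(password: string): string {
  const salt = randomBytes(16);
  const derived = scryptSync(password, salt, KEY_LENGTH);
  return `${salt.toString("hex")}:${derived.toString("hex")}`;
}

// Re-derives the key with the stored salt and compares in constant time.
function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  if (!saltHex || !hashHex) return false;
  const expected = Buffer.from(hashHex, "hex");
  if (expected.length === 0) return false;
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return timingSafeEqual(actual, expected);
}
```

A per-user random salt means two visitors with the same password get different stored values, and `timingSafeEqual` avoids leaking how many leading bytes matched.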

### Phase 2 — Sending a Message

1. User types a message and submits the form → `POST /api/chat/message` with `{ conversationId, content }`.

2. Server saves the user message to the `messages` table (`role: 'user'`).

3. Full conversation history is retrieved from the database.

4. Server scans past assistant messages to detect if one-time phrases ("Dear one", "you are not broken") have already been used.
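
The one-time phrase scan in step 4 can be sketched as a small pure function. The name `findUsedPhrases` and the case-insensitive matching are assumptions; the document only states that past assistant messages are checked for phrases such as "Dear one" and "you are not broken".

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Phrases named in this document; the real list may be longer.
const ONE_TIME_PHRASES = ["Dear one", "you are not broken"];

// Returns the phrases the assistant has already used in this conversation.
// User messages are ignored so that a user quoting a phrase does not count.
function findUsedPhrases(
  history: ChatMessage[],
  phrases: string[] = ONE_TIME_PHRASES,
): string[] {
  const assistantText = history
    .filter((m) => m.role === "assistant")
    .map((m) => m.content.toLowerCase())
    .join("\n");
  return phrases.filter((p) => assistantText.includes(p.toLowerCase()));
}
```

The returned list can then be passed into the system prompt so the model is told which phrases to avoid repeating.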

### Phase 3 — RAG Context Retrieval

1. The user's latest message is vectorised via `generateEmbedding()` → OpenAI `text-embedding-3-small` model → returns a `float[]` vector.

2. `findRelevantContext()` in `server/embeddings.ts` fetches all stored embeddings from three sources:

   - `resource_embeddings` (trauma exercises)

   - `page_embeddings` (institutional pages)

   - `knowledge_documents` (admin-uploaded knowledge base)

3. **Cosine similarity** is computed between the query vector and every stored embedding.

4. The top-5 most similar items are assembled into a context string, each including title, type, URL, description, and up to 500 characters of content.
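
Steps 3 and 4 amount to a brute-force nearest-neighbour search. The sketch below shows one way to do it; the `EmbeddedItem` shape and the helper names `cosineSimilarity`, `topK`, and `buildContext` are hypothetical, and the exact context format used by `findRelevantContext()` is not shown in this document.

```typescript
type EmbeddedItem = {
  title: string;
  type: string;
  url: string;
  description: string;
  content: string;
  embedding: number[];
};

// cos(a, b) = (a · b) / (|a| |b|); returns 0 for a zero-length vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores every item against the query and keeps the k best.
function topK(query: number[], items: EmbeddedItem[], k = 5): EmbeddedItem[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(query, item.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.item);
}

// Joins the selected items into one context string, truncating content to 500 characters.
function buildContext(items: EmbeddedItem[]): string {
  return items
    .map((i) =>
      [
        `Title: ${i.title}`,
        `Type: ${i.type}`,
        `URL: ${i.url}`,
        `Description: ${i.description}`,
        `Content: ${i.content.slice(0, 500)}`,
      ].join("\n"),
    )
    .join("\n\n");
}
```

Scanning every stored embedding is linear in the number of items, which is reasonable for a small exercise library; a larger corpus would usually call for a vector index instead.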

### Phase 4 — AI Response Generation

1. `getChatCompletion()` in `server/openai.ts` sends the following to OpenAI `gpt-4o`:

   - **System prompt:** Defines the persona "The Heart Space" (blend of Louise Hay and Sarah Blondin), tone rules, one-time phrase guards, exercise suggestion format (`[[Title|URL]]`), and the RAG context.

   - **Message history:** All prior user + assistant messages in the conversation.

   - **Temperature:** `0.7`

2. The AI response is returned and saved to the `messages` table (`role: 'assistant'`).
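
As a rough illustration of how the request in step 1 might be assembled, the sketch below builds a chat-completion payload from a system prompt, the RAG context, and the stored history. The function name `buildChatRequest` and the way the context is appended to the prompt are assumptions; the real system prompt text is not reproduced in this document.

```typescript
type Role = "system" | "user" | "assistant";
type ChatMessage = { role: Role; content: string };

type ChatRequest = {
  model: string;
  temperature: number;
  messages: ChatMessage[];
};

// Puts the system prompt (with RAG context appended) first, followed by
// the full conversation history in chronological order.
function buildChatRequest(
  systemPrompt: string,
  ragContext: string,
  history: ChatMessage[],
): ChatRequest {
  const system: ChatMessage = {
    role: "system",
    content: `${systemPrompt}\n\nRelevant toolkit content:\n${ragContext}`,
  };
  return {
    model: "gpt-4o",
    temperature: 0.7,
    messages: [system, ...history],
  };
}
```

The resulting object has the shape accepted by the OpenAI chat completions endpoint (`model`, `temperature`, `messages`), so it can be passed to the SDK call that `getChatCompletion()` presumably wraps.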

### Phase 5 — Pre-Generated Speech Cache (Background)

Immediately after saving the AI message, the server **pre-generates audio in the background** (without blocking the HTTP response):

```ts
// server/routes.ts (lines 248–254)
generateHeyGenSpeech(cleanTextForSpeech(aiResponse)).then(result => {
  if ("audio" in result) {
    speechCache.set(msgId, { buffer: result.audio, expiresAt: Date.now() + 15 * 60 * 1000 });
  }
}).catch(() => {});
```

- The audio `Buffer` is stored in a **server-side in-memory Map** (`speechCache`) keyed by `messageId`.

- Cache entries **expire after 15 minutes** and are purged every 5 minutes.

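- As a result, the audio is usually ready by the time the user clicks "Listen"; if pre-generation has not finished or has failed, the speech route falls through to on-demand synthesis.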
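
The cache behaviour described above (15-minute expiry, 5-minute purge, single-use reads) can be sketched as follows. The helper names `putSpeech`, `takeSpeech`, and `purgeExpired` are hypothetical; only `speechCache` and the entry shape `{ buffer, expiresAt }` come from the snippet above.

```typescript
type CacheEntry = { buffer: Buffer; expiresAt: number };

const TTL_MS = 15 * 60 * 1000;
const PURGE_INTERVAL_MS = 5 * 60 * 1000;

const speechCache = new Map<number, CacheEntry>();

// Stores audio for a message; `now` is injectable to make expiry testable.
function putSpeech(messageId: number, buffer: Buffer, now = Date.now()): void {
  speechCache.set(messageId, { buffer, expiresAt: now + TTL_MS });
}

// Returns the audio at most once: the entry is deleted on read,
// and an expired entry is treated as a miss.
function takeSpeech(messageId: number, now = Date.now()): Buffer | undefined {
  const entry = speechCache.get(messageId);
  if (!entry) return undefined;
  speechCache.delete(messageId);
  return entry.expiresAt > now ? entry.buffer : undefined;
}

// Removes every expired entry and reports how many were dropped.
function purgeExpired(now = Date.now()): number {
  let removed = 0;
  speechCache.forEach((entry, id) => {
    if (entry.expiresAt <= now) {
      speechCache.delete(id);
      removed++;
    }
  });
  return removed;
}

// unref() lets the process exit even while the purge timer is scheduled.
setInterval(() => purgeExpired(), PURGE_INTERVAL_MS).unref();
```

Because the cache lives in process memory, entries are lost on restart and are not shared between multiple server processes; in both cases the speech route simply sees a cache miss.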

### Phase 6 — Voice Playback ("Listen" Button)

1. User clicks the **Listen** button below any assistant message → `playSpeech(id, text)` runs in `ChatBot.tsx`.

2. Client calls `POST /api/chat/speech` with `{ text, messageId }`.

3. Server checks in-memory `speechCache`:

   - **Cache HIT:** Returns the pre-buffered `audio/mpeg` binary immediately, then deletes the entry.

   - **Cache MISS:** Calls `generateHeyGenSpeech(cleanText)` on-demand.

4. `generateHeyGenSpeech()` in `server/openai.ts`:

   - **Step 1:** `POST https://api.heygen.com/v3/voices/speech` with `{ text, voice_id: FELICIA_VOICE_ID, speed: 1.0 }` → HeyGen returns `{ data: { audio_url, duration } }`.

   - **Step 2:** Server fetches the binary audio from `audio_url` and returns it as an `audio/mpeg` stream to the client.

5. **Fallback:** If HeyGen fails (API key missing, network error, or API error), server falls back to OpenAI `tts-1-hd` with the `nova` voice at `speed: 0.9`.

6. Client receives the `Blob`, creates an object URL, constructs an `HTMLAudioElement`, and calls `.play()`.
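
The server-side decision order in steps 3 to 5 (cache, then HeyGen, then OpenAI) can be expressed as a small function with the providers injected. This is a sketch under assumptions: `resolveSpeech` and the `SpeechProvider` type are hypothetical names, and the real route handler may structure this differently.

```typescript
type SpeechProvider = (text: string) => Promise<Buffer>;
type SpeechSource = "cache" | "primary" | "fallback";

// Tries the cached audio first, then the primary provider (HeyGen in this
// document), and only calls the fallback (OpenAI TTS) if the primary throws.
async function resolveSpeech(
  text: string,
  cached: Buffer | undefined,
  primary: SpeechProvider,
  fallback: SpeechProvider,
): Promise<{ audio: Buffer; source: SpeechSource }> {
  if (cached) return { audio: cached, source: "cache" };
  try {
    return { audio: await primary(text), source: "primary" };
  } catch {
    return { audio: await fallback(text), source: "fallback" };
  }
}
```

Returning the `source` alongside the audio makes it easy to log how often the cloned voice is actually served versus the fallback voice.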

### Phase 7 — Text Pre-Processing for Speech

Before any text is sent to the TTS providers, it is cleaned by `cleanTextForSpeech()`:

```ts
// server/routes.ts (lines 19–24)
function cleanTextForSpeech(text: string): string {
  return text
    .replace(/\[\[(.*?)\|(.*?)\]\]/g, "I suggest the $1") // [[Title|URL]] → spoken suggestion
    .replace(/\[(.*?)\]\((.*?)\)/g, "$1") // [Title](URL) → title only
    .replace(/[*_#]/g, ""); // strip Markdown emphasis and heading markers
}
```
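
To make the three replacements concrete, here is a self-contained copy of the function applied to a sample reply. The second pattern is written on the assumption that it is meant to match standard Markdown links of the form `[Title](URL)`; the sample text and URLs are invented for illustration.

```typescript
function cleanTextForSpeech(text: string): string {
  return text
    .replace(/\[\[(.*?)\|(.*?)\]\]/g, "I suggest the $1") // [[Title|URL]] becomes a spoken suggestion
    .replace(/\[(.*?)\]\((.*?)\)/g, "$1") // [Title](URL) keeps the title only
    .replace(/[*_#]/g, ""); // strip Markdown emphasis and heading markers
}

const sample =
  "[[Box Breathing|/resources/3]] may help. See [this guide](https://example.com/guide) for **more**.";

console.log(cleanTextForSpeech(sample));
// prints: I suggest the Box Breathing may help. See this guide for more.
```

The order matters: links are rewritten before the final pass, so underscores or `#` characters inside a URL are removed together with the URL rather than being stripped individually and read aloud.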

---

## API Route Reference

| Method | Route | Description |
|---|---|---|
| `POST` | `/api/chat/register` | Create account → visitor + conversation |
| `POST` | `/api/chat/login` | Authenticate → visitor + conversations list |
| `POST` | `/api/chat/new-conversation` | Create a new conversation for existing visitor |
| `GET` | `/api/chat/visitor/:visitorId/conversations` | List all conversation previews |
| `POST` | `/api/chat/start` | Legacy guest start (no password) |
| `POST` | `/api/chat/message` | Send message → triggers RAG + GPT-4o + voice pre-gen |
| `GET` | `/api/chat/history/:conversationId` | Retrieve all messages in a conversation |
| `POST` | `/api/chat/speech` | Get audio for a message (cache-first, HeyGen primary) |
| `POST` | `/api/tts` | Generic TTS via ElevenLabs (used by `useTTS` hook) |
| `POST` | `/api/admin/index-knowledge` | Re-index all resources + pages into vector DB |

---

## HeyGen API Details

**Endpoint:** `POST https://api.heygen.com/v3/voices/speech`

**Request headers:**

```
X-Api-Key: <HEYGEN_API_KEY>
Content-Type: application/json
```

**Request body:**

```json
{
  "text": "Your response text here",
  "voice_id": "b5d52e83e8fe4a34a24bc5cffd2ada3a",
  "speed": 1.0
}
```

**Response body:**

```json
{
  "error": null,
  "data": {
    "audio_url": "https://cdn.heygen.com/.../audio.mp3",
    "duration": 4.2
  }
}
```

The server then fetches the binary from `audio_url` and proxies it directly as `audio/mpeg` to the client — the client never touches the HeyGen CDN directly.
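
The two-step exchange described in this section might be implemented along the following lines. This is a sketch, not the contents of `generateHeyGenSpeech()`: the function name `synthesizeWithHeyGen`, the `FetchLike` type, and the error messages are assumptions, and the endpoint, headers, and response shape are taken from this document rather than verified against HeyGen's own reference. The HTTP client is injectable so the logic can be exercised without network access.

```typescript
type HttpResponse = {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
  arrayBuffer(): Promise<ArrayBuffer>;
};

type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<HttpResponse>;

const HEYGEN_SPEECH_URL = "https://api.heygen.com/v3/voices/speech";

async function synthesizeWithHeyGen(
  text: string,
  voiceId: string,
  apiKey: string,
  fetchImpl: FetchLike = fetch,
): Promise<Buffer> {
  // Step 1: ask HeyGen to synthesise the text; the reply carries a URL, not audio.
  const res = await fetchImpl(HEYGEN_SPEECH_URL, {
    method: "POST",
    headers: { "X-Api-Key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ text, voice_id: voiceId, speed: 1.0 }),
  });
  if (!res.ok) throw new Error(`HeyGen speech request failed: HTTP ${res.status}`);

  const payload = (await res.json()) as {
    error?: unknown;
    data?: { audio_url?: string };
  };
  const audioUrl = payload.data?.audio_url;
  if (payload.error || !audioUrl) throw new Error("HeyGen response contained no audio_url");

  // Step 2: download the audio so the server can proxy it to the client.
  const audioRes = await fetchImpl(audioUrl);
  if (!audioRes.ok) throw new Error(`Audio download failed: HTTP ${audioRes.status}`);
  return Buffer.from(await audioRes.arrayBuffer());
}
```

Throwing on any failure keeps the caller simple: a single `catch` around this call is enough to trigger the OpenAI TTS fallback.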

---

## OpenAI TTS Fallback Details

**Model:** `tts-1-hd`

**Voice:** `nova` (warmer and more expressive)

**Speed:** `0.9` (slightly slower to avoid monotonous delivery)

Called via the official `openai` Node.js SDK:

```ts

openai.audio.speech.create({ model: "tts-1-hd", voice: "nova", input: text, speed: 0.9 })

```

---

## Process Management

The server is managed by **PM2** with the app name `trauma-navigator`.

```bash
pm2 start ecosystem.config.cjs   # start
pm2 restart trauma-navigator     # restart after code changes
pm2 logs trauma-navigator        # view logs
pm2 status                       # check running status
```

---

## How to Replace the Cloned Voice

1. Record new voice samples and create a new Voice Clone in HeyGen Studio.

2. Copy the new Voice ID from the HeyGen dashboard.

3. Update `FELICIA_VOICE_ID` in `server/openai.ts` (line 97):

```ts

const FELICIA_VOICE_ID = "<new-voice-id-here>";

```

4. Restart the server: `pm2 restart trauma-navigator`

---

## Speech Cache Architecture

```
POST /api/chat/message
├── Save user msg
├── RAG lookup
├── GPT-4o completion
├── Save assistant msg (msgId = N)
└── [background] generateHeyGenSpeech(text)
    └── speechCache.set(N, { buffer, expiresAt })

POST /api/chat/speech { messageId: N }
├── Cache HIT → return buffer, delete entry (instant)
└── Cache MISS → generateHeyGenSpeech on-demand (2–4s)
```

Cache TTL: **15 minutes**. Purge interval: **5 minutes**.

---

# Integrate The Heart Space Chatbot into a New Project

This plan walks a new developer through adding the full AI chatbot (GPT-4o + HeyGen voice clone + RAG) from Trauma Navigator into a new Node.js / Express / React / PostgreSQL / Drizzle / PM2 project.

## Prerequisites — gather before starting

- **OpenAI API key**: needs access to `gpt-4o`, `text-embedding-3-small`, and `tts-1-hd`
- **HeyGen API key**: `sk_V2_...` format, from app.heygen.com → Settings → API
- **HeyGen Voice ID**: see Step 3 below for how to clone a voice and get the UUID
- **PostgreSQL** database running and accessible via `DATABASE_URL`
- **Node.js ≥ 20**, with `npm`, `tsx`, and `pm2` installed globally

## Step 1 — Copy source files into the new project

Copy these files/directories verbatim from the Trauma Navigator repo into your project root:

| Source path | What it is |
|---|---|
| `server/openai.ts` | GPT-4o completion + HeyGen + OpenAI TTS functions |
| `server/routes.ts` | All chatbot API routes + speech cache logic |
| `server/embeddings.ts` | RAG retrieval: cosine similarity + context builder |
| `server/auth.ts` | scrypt password hashing / verification |
| `server/storage.ts` | Drizzle database access layer |
| `shared/schema.ts` | Drizzle schema (all tables) |
| `client/src/components/ChatBot.tsx` | React floating chat widget |
| `client/src/hooks/use-tts.ts` | Generic TTS hook (calls `/api/tts`) |

**Register routes:** in your main `index.ts`, import and mount the routes from `routes.ts` with `app.use(routes)`.

## Step 2 — Install npm dependencies

Add these packages to `package.json` and run `npm install`:

```
openai            pg    drizzle-orm    drizzle-zod    drizzle-kit
express           multer             pdf-parse
zod               zod-validation-error
bcryptjs          @types/bcryptjs
express-session   connect-pg-simple  memorystore
passport          passport-local
```

For the React client, these are already present if you scaffolded with Vite + Shadcn, but confirm:

```
@tanstack/react-query   lucide-react   wouter   framer-motion
```

## Step 3 — Create the HeyGen voice clone (or reuse existing)

Skip this step if you have an existing Voice ID.

1. Record 1–5 minutes of clean speech (no background noise) from the target voice.
2. Log in to HeyGen Studio → Voices → Voice Clone.
3. Upload samples → create Instant Voice Clone (or Professional for higher quality).
4. Once processed, copy the Voice ID UUID shown in the dashboard.

## Step 4 — Set the Voice ID in the codebase

In `server/openai.ts`, update line 97:

```ts
const FELICIA_VOICE_ID = "<your-voice-id-uuid-here>";
```

## Step 5 — Configure environment variables

Create a `.env` file in the project root:

```
DATABASE_URL=postgres://user:password@host:5432/dbname
OPENAI_API_KEY=sk-...
HEYGEN_API_KEY=sk_V2_...
ELEVENLABS_API_KEY=           # optional — legacy fallback, can be left empty
ADMIN_PASSWORD=your-admin-pw  # protects /api/admin/* routes
SITE_URL=https://your-domain.com
```

For PM2, mirror all keys in `ecosystem.config.cjs` inside the `env` block so the production process receives them:

```js
module.exports = {
  apps: [{
    name: "your-app-name",
    script: "dist/index.cjs",
    env: {
      NODE_ENV: "production",
      DATABASE_URL: "...",
      OPENAI_API_KEY: "...",
      HEYGEN_API_KEY: "...",
      ADMIN_PASSWORD: "...",
      SITE_URL: "..."
    }
  }]
};
```
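
Because a missing key only shows up later as a failed API call or a silent switch to the fallback voice, it can help to check the environment once at startup. The sketch below is an optional addition and not part of the Trauma Navigator code; `missingEnv` and `REQUIRED_ENV` are hypothetical names, and the list mirrors the required keys from the `.env` example above.

```typescript
// Keys the server needs at runtime; ELEVENLABS_API_KEY is optional and omitted.
const REQUIRED_ENV = [
  "DATABASE_URL",
  "OPENAI_API_KEY",
  "HEYGEN_API_KEY",
  "ADMIN_PASSWORD",
  "SITE_URL",
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_ENV,
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Possible usage at the top of the server entry point:
//
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) {
//     throw new Error(`Missing environment variables: ${missing.join(", ")}`);
//   }
```

Failing fast this way surfaces configuration mistakes in `pm2 logs` immediately after start, rather than on the first chat request.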

## Step 6 — Apply the database schema

Run the Drizzle migration to create all required tables in PostgreSQL:

```bash
npm run db:push
```

This reads `schema.ts` and pushes the schema against `DATABASE_URL`. Tables created: `visitors`, `conversations`, `messages`, `resources`, `pages`, `resource_embeddings`, `page_embeddings`, `knowledge_documents`.

## Step 7 — Seed resources (exercise library) for RAG

The RAG system searches the `resources` and `pages` tables. Without content, the chatbot still works but won't suggest exercises.

- Insert resource rows into the `resources` table (`title`, `description`, `content`, `type`, `category`).
- Insert page rows into the `pages` table if you have institutional content.

## Step 8 — Build the project

```bash
npm run build
```

This compiles the Express server to `index.cjs` and bundles the React client into `public`.

## Step 9 — Start with PM2

```bash
pm2 start ecosystem.config.cjs
pm2 save               # persist across reboots
pm2 startup            # install startup hook
```

Verify:

```bash
pm2 status             # should show "online"
pm2 logs your-app-name # watch for startup errors
```

## Step 10 — Run RAG indexing (generates embeddings)

After the server is running, trigger initial embedding generation:

```bash
curl -X POST https://your-domain.com/api/admin/index-knowledge \
  -H "x-admin-password: your-admin-pw"
```

This vectorises all resources, pages, and knowledge documents using `text-embedding-3-small` and writes the results to `resource_embeddings`, `page_embeddings`, and `knowledge_documents`.

Re-run any time you add new resources or pages.

## Step 11 — Verify end-to-end

1. Open the app in a browser → the floating heart button should appear (rendered by `ChatBot.tsx`).
2. Register a new account → verify a `visitors` row and `conversations` row are created in the DB.
3. Send a message → confirm an AI response is returned.
4. Click the Listen button → audio should play (2–4s on cache miss, instant on cache hit).
5. Check PM2 logs for any HeyGen API errors; if voice fails, the fallback OpenAI `nova` voice will play automatically.

## Troubleshooting quick reference

| Symptom | Likely cause | Fix |
|---|---|---|
| `DATABASE_URL` error on startup | Env var missing | Check `.env` and `ecosystem.config.cjs` |
| No AI response | `OPENAI_API_KEY` invalid or model not enabled | Verify key in OpenAI dashboard |
| Voice plays OpenAI nova instead of cloned voice | `HEYGEN_API_KEY` wrong or Voice ID incorrect | Check HeyGen dashboard; update `.env` and `FELICIA_VOICE_ID` constant |
| RAG returns no context | Embeddings not indexed | Re-run `POST /api/admin/index-knowledge` |
| `pm2 logs` shows port conflict | Another process on same port | Change `PORT` in `.env` or stop the conflicting process |
