Random stuff to learn. What follows is old and incorrect, but I haven't bothered to rewrite it.

| Language | Type | Dependencies |
|---|---|---|
| Rust/Python | Tools library | None |

Takes text and generates phonemes. Where possible, the input also carries the timing of each word, so that the output can include timings for the phonemes as well.

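
As a rough sketch of the text-to-phonemes step, assuming a tiny hard-coded lexicon and an even split of each word's duration across its phonemes: `LEXICON`, `TimedWord`, `TimedPhoneme` and `phonemize` are hypothetical names used for illustration only, and the ARPAbet-style symbols are an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon; a real tool would rely on a pronunciation
# dictionary or a grapheme-to-phoneme model instead.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


@dataclass
class TimedWord:
    text: str
    start: Optional[float] = None  # seconds, if known
    end: Optional[float] = None    # seconds, if known


@dataclass
class TimedPhoneme:
    symbol: str
    start: Optional[float] = None
    end: Optional[float] = None


def phonemize(words: List[TimedWord]) -> List[TimedPhoneme]:
    """Turn words into phonemes, carrying timings over when the word has them.

    Assumption: a word's duration is split evenly across its phonemes.
    Words missing from the lexicon produce no phonemes.
    """
    result: List[TimedPhoneme] = []
    for word in words:
        symbols = LEXICON.get(word.text.lower(), [])
        timed = word.start is not None and word.end is not None
        step = (word.end - word.start) / len(symbols) if timed and symbols else None
        for i, symbol in enumerate(symbols):
            if step is None:
                # No timing information: emit the phoneme without timings.
                result.append(TimedPhoneme(symbol))
            else:
                result.append(
                    TimedPhoneme(symbol, word.start + i * step, word.start + (i + 1) * step)
                )
    return result
```

With this sketch, `phonemize([TimedWord("hello", 0.0, 0.4)])` yields four phonemes of roughly 0.1 s each, while `phonemize([TimedWord("hello")])` yields the same symbols with no timings.
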
| Language | Type | Dependencies |
|---|---|---|
| HTML5 | Tools library | None |

Takes phonemes and shows a mouth animation. A configuration of images maps each image to phonemes. It will probably take a canvas context and a position, and render the animation in that canvas.

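
The image lookup behind such an animation could work roughly as follows. This is only a sketch of the logic, written in Python for brevity even though the component itself would run in the browser; the image file names, `MOUTH_IMAGES`, `DEFAULT_IMAGE`, `build_timeline` and `image_at` are all hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical configuration mapping phoneme symbols to mouth images.
MOUTH_IMAGES = {
    "AH": "mouth_open.png",
    "OW": "mouth_round.png",
    "M": "mouth_closed.png",
}
DEFAULT_IMAGE = "mouth_rest.png"  # shown for unmapped phonemes and silence

Frame = Tuple[float, float, str]  # (start, end, image)


def build_timeline(phonemes: List[Tuple[str, float, float]]) -> List[Frame]:
    """Convert (symbol, start, end) phonemes, sorted by start, into image frames."""
    return [
        (start, end, MOUTH_IMAGES.get(symbol, DEFAULT_IMAGE))
        for symbol, start, end in phonemes
    ]


def image_at(timeline: List[Frame], t: float) -> str:
    """Return the image to draw at time t (seconds)."""
    starts = [start for start, _, _ in timeline]
    i = bisect_right(starts, t) - 1  # last frame starting at or before t
    if i >= 0:
        _, end, image = timeline[i]
        if t < end:
            return image
    return DEFAULT_IMAGE
```

A render loop would call `image_at` with the current playback time on every frame and draw the returned image at the configured canvas position.
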
| Language | Type | Dependencies |
|---|---|---|
| HTML5 | Tools library | pran-phonemes#FRONTEND |

Takes a configuration listing animations and phonemes (the same phoneme configuration as pran-phonemes#FRONTEND). Animations can be triggered together with phonemes.

| Language | Type | Dependencies |
|---|---|---|
| Rust/Python | Application | pran-phonemes#CORE |

Converts audio to text, sends the text to pran-phonemes#CORE and sends the results back to the listener. Runs as a local server that the browser can connect to, probably over a WebSocket.

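
The request flow described above might look like the sketch below. The transport is deliberately left out, since no WebSocket library has been chosen; `handle_audio`, the `transcribe` and `phonemize` callables and the JSON field names are assumptions for illustration, not an existing API.

```python
import json
from typing import Callable, List, Tuple

Phoneme = Tuple[str, float, float]  # (symbol, start, end)


def handle_audio(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    phonemize: Callable[[str], List[Phoneme]],
) -> str:
    """Audio in, JSON reply out.

    `transcribe` stands in for the speech-to-text step and `phonemize`
    for the call into pran-phonemes#CORE; both are injected so the flow
    can be exercised without either dependency.
    """
    text = transcribe(audio)
    phonemes = phonemize(text)
    # Hypothetical message shape sent back to the listener.
    return json.dumps(
        {
            "text": text,
            "phonemes": [
                {"symbol": symbol, "start": start, "end": end}
                for symbol, start, end in phonemes
            ],
        }
    )
```

A server would call `handle_audio` for each audio message it receives and send the returned string back over the same connection.
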
| Language | Type | Dependencies |
|---|---|---|
| HTML5 | Application | pran-animation#FRONTEND + pran-echo#CORE |

Configures the animation frontend with very few animations but all the phonemes. Takes audio as input, sends it to pran-echo#CORE, waits for the results and passes them to pran-animation#FRONTEND. Ideally, an intuitive UI lets the user choose which animations to show at different times. It should also be able to export the animation as a GIF or video, using MediaRecorder.