Added Python C extensions cdtw and cmfcc. Code cleanup. RATEAGGRESSIVE and OFFSET boundary adj algos. Code cleanup.

Added Python C Extension for computing the DTW, and supporting scripts to install/compile it

Added unit test for cdtw

Added cmfcc and the Python-C-Python glue code. TODO: testing.

C code cleanup

Added unit tests for cmfcc and AbstractWave

Moved MFCC extraction code into AudioFile. Added OFFSET and RATEAGGRESSIVE boundary adj algos. Added tools.extract_mfcc

Code cleanups. Most of the planned long tests done. Ready to release as v1.1.0.
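
For readers unfamiliar with the computation that the new `cdtw` extension speeds up, here is a minimal pure-Python sketch of dynamic time warping (DTW) with an optional Sakoe-Chiba band. It is an illustration only, not the code added by this commit: the function name `dtw_cost`, the `band` parameter, and the use of scalar frames with an absolute-difference distance are all assumptions made for the example (real MFCC frames are vectors).

```python
import math


def dtw_cost(a, b, band=None):
    """Return the DTW alignment cost between two non-empty sequences.

    a, b -- sequences of numbers (hypothetical stand-ins for MFCC frames)
    band -- optional Sakoe-Chiba half-width in frames; None means no band
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # Only two rows of the accumulated-cost matrix are kept in memory.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        if band is None:
            lo, hi = 1, m
        else:
            # Restrict the columns to a stripe around the scaled diagonal.
            center = i * m / float(n)
            lo = max(1, int(math.floor(center - band)))
            hi = min(m, int(math.ceil(center + band)))
        for j in range(lo, hi + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Step from above, from the diagonal, or from the left.
            cur[j] = d + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[m]


print(dtw_cost([1, 2, 3], [1, 2, 2, 3]))
print(dtw_cost([0, 0, 1], [0, 1, 1], band=0))
```

The sketch returns only the total cost, so two rows suffice; recovering the warping path needs the full matrix (or the full stripe), which is where the memory cost of DTW comes from. If `band` is too narrow for the two lengths, no path exists and the function returns infinity.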
Alberto Pettarin committed Aug 20, 2015
1 parent b1478e8 commit 4601bdf
Showing 63 changed files with 2,931 additions and 704 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,6 +1,8 @@
*.py[cdo]
*.swp
*.so
MANIFEST
aeneas/build
bak
dist
docs/build
23 changes: 17 additions & 6 deletions README.md
@@ -2,8 +2,8 @@

**aeneas** is a Python library and a set of tools to automagically synchronize audio and text.

* Version: 1.0.4
* Date: 2015-08-09
* Version: 1.1.0
* Date: 2015-08-21
* Developed by: [ReadBeyond](http://www.readbeyond.it/)
* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -85,6 +85,7 @@ for example using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant)
$ git clone https://github.com/readbeyond/aeneas.git
$ cd aeneas
$ pip install -r requirements.txt
$ bash compile_c_extensions.sh
$ python check_dependencies.py
```

@@ -99,18 +100,24 @@ If you get an error, try running the
$ sudo bash install_dependencies.sh
```

and then try running `check_dependencies.py` again.
and then try running `compile_c_extensions.sh` and `check_dependencies.py` again.

If you are a Windows user, please read
[these directions](https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ),
written by Richard Margetts.

Alternatively, consider using the [Vagrant box](http://www.vagrantup.com)
created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).


## Usage

1. Clone this GitHub repo:

```bash
$ git clone https://github.com/readbeyond/aeneas.git
$ cd aeneas
$ bash compile_c_extensions.sh
```

2. To compute a SMIL synchronization map `map.smil` for a pair
@@ -168,6 +175,8 @@ Tutorial: [A Practical Introduction To The aeneas Package](http://www.albertopet

Mailing list: [https://groups.google.com/d/forum/aeneas-forced-alignment](https://groups.google.com/d/forum/aeneas-forced-alignment)

Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.readbeyond.it/aeneas/docs/changelog.html)


## Supported Features

@@ -181,20 +190,19 @@ Mailing list: [https://groups.google.com/d/forum/aeneas-forced-alignment](https:
* Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
* Code suitable for a Web app deployment (e.g., on-demand AWS instances)
* Adjustable splitting times, including a max character/second constraint for CC applications
* MFCC and DTW computed as Python C extensions to reduce the processing time


## Limitations and Missing Features

* Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
* Audio is assumed to be spoken: not suitable/YMMV for song captioning
* Offline (i.e., not real time/near real time) approach
* DTW computation is memory hungry
* No protection against memory trashing


## TODO List

* Improving the speed of the code, especially when Sakoe-Chiba kicks in
* Improving robustness against music in background
* Isolate non-speech intervals (music, prolonged silence)
* Automated text fragmentation based on audio analysis
@@ -203,7 +211,7 @@ Mailing list: [https://groups.google.com/d/forum/aeneas-forced-alignment](https:
* Improving (removing?) dependency from `espeak`, `ffmpeg`, `ffprobe` executables
* Multilevel sync map granularity (e.g., multilevel SMIL output)
* Supporting input text encodings other than UTF-8
* Adding (testing) more languages
* Adding (i.e., testing) more languages
* Better documentation
* Testing other approaches, like HMM
* Publishing the package on PyPI
@@ -336,6 +344,8 @@ and a Web application
**May 2015**: release of this package on GitHub
**August 2015**: release of v1.1.0, including Python C extensions
to speed the computation of audio/text alignment up
## Acknowledgments
@@ -350,3 +360,4 @@ helped shaping the structure of this package
for its asynchronous usage.
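
The README changes above add a `compile_c_extensions.sh` step before the tools are used. One common way for a Python package to cope with a compiled extension that may not have been built is to try the import and fall back to a pure-Python routine. The sketch below shows that pattern only; whether aeneas does this is not shown in this excerpt, and the names `load_optional_extension` and `pick_implementation` are hypothetical.

```python
import importlib


def load_optional_extension(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def pick_implementation(module_name, attr, fallback):
    """Return attr from the optional module if callable, else fallback."""
    ext = load_optional_extension(module_name)
    func = getattr(ext, attr, None) if ext is not None else None
    return func if callable(func) else fallback


# Illustration with a stdlib module standing in for a compiled extension.
sqrt = pick_implementation("math", "sqrt", lambda x: x ** 0.5)
print(sqrt(9.0))
```

With this pattern a missing or uncompiled extension costs speed rather than functionality, which is the usual motivation for keeping a pure-Python path alongside the C one.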
24 changes: 18 additions & 6 deletions README.txt
@@ -4,8 +4,8 @@ aeneas
**aeneas** is a Python library and a set of tools to automagically
synchronize audio and text.

- Version: 1.0.4
- Date: 2015-08-09
- Version: 1.1.0
- Date: 2015-08-21
- Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
- Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
- License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -92,6 +92,7 @@ Installation
$ git clone https://github.com/readbeyond/aeneas.git
$ cd aeneas
$ pip install -r requirements.txt
$ bash compile_c_extensions.sh
$ python check_dependencies.py

If the last command prints a success message, you have all the required
@@ -105,7 +106,12 @@ If you get an error, try running the `provided

$ sudo bash install_dependencies.sh

and then try running ``check_dependencies.py`` again.
and then try running ``compile_c_extensions.sh`` and
``check_dependencies.py`` again.

If you are a Windows user, please read `these
directions <https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ>`__,
written by Richard Margetts.

Alternatively, consider using the `Vagrant
box <http://www.vagrantup.com>`__ created by
@@ -120,6 +126,7 @@ Usage

$ git clone https://github.com/readbeyond/aeneas.git
$ cd aeneas
$ bash compile_c_extensions.sh

2. To compute a SMIL synchronization map ``map.smil`` for a pair
(``audio.mp3``, ``text.txt``), you can run:
@@ -177,6 +184,8 @@ Package <http://www.albertopettarin.it/blog/2015/05/21/a-practical-introduction-

Mailing list: https://groups.google.com/d/forum/aeneas-forced-alignment

Changelog: http://www.readbeyond.it/aeneas/docs/changelog.html

Supported Features
------------------

@@ -197,21 +206,21 @@ Supported Features
instances)
- Adjustable splitting times, including a max character/second
constraint for CC applications
- MFCC and DTW computed as Python C extensions to reduce the processing
time

Limitations and Missing Features
--------------------------------

- Audio should match the text: large portions of spurious text or audio
might produce a wrong sync map
- Audio is assumed to be spoken: not suitable/YMMV for song captioning
- Offline (i.e., not real time/near real time) approach
- DTW computation is memory hungry
- No protection against memory trashing

TODO List
---------

- Improving the speed of the code, especially when Sakoe-Chiba kicks in
- Improving robustness against music in background
- Isolate non-speech intervals (music, prolonged silence)
- Automated text fragmentation based on audio analysis
@@ -221,7 +230,7 @@ TODO List
``ffprobe`` executables
- Multilevel sync map granularity (e.g., multilevel SMIL output)
- Supporting input text encodings other than UTF-8
- Adding (testing) more languages
- Adding (i.e., testing) more languages
- Better documentation
- Testing other approaches, like HMM
- Publishing the package on PyPI
@@ -345,6 +354,9 @@ application

**May 2015**: release of this package on GitHub

**August 2015**: release of v1.1.0, including Python C extensions to
speed the computation of audio/text alignment up

Acknowledgments
---------------

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.0.4
1.1.0
6 changes: 3 additions & 3 deletions aeneas/__init__.py
@@ -23,7 +23,6 @@
from aeneas.job import Job, JobConfiguration
from aeneas.language import Language
from aeneas.logger import Logger
#from aeneas.mfcc
from aeneas.syncmap import SyncMap, SyncMapFragment, SyncMapFormat
from aeneas.synthesizer import Synthesizer
from aeneas.task import Task, TaskConfiguration
@@ -34,10 +33,11 @@
__author__ = "Alberto Pettarin"
__copyright__ = """
Copyright 2012-2013, Alberto Pettarin (www.albertopettarin.it)
Copyright 2013-2015, ReadBeyond Srl (www.readbeyond.it)
Copyright 2013-2015, ReadBeyond Srl (www.readbeyond.it)
Copyright 2015, Alberto Pettarin (www.albertopettarin.it)
"""
__license__ = "GNU AGPL v3"
__version__ = "1.0.4"
__version__ = "1.1.0"
__email__ = "[email protected]"
__status__ = "Production"
