Added Python C extensions cdtw and cmfcc. Code cleanup. RATEAGGRESSIVE and OFFSET boundary adj algos. Code cleanup.

Added Python C Extension for computing the DTW, and supporting scripts to install/compile it
Added unit test for cdtw
Added cmfcc and the Python-C-Python glue code. TODO: testing.
C code cleanup
Added unit tests for cmfcc and AbstractWave
Moved MFCC extraction code into AudioFile.
Added OFFSET and RATEAGGRESSIVE boundary adj algos.
Added tools.extract_mfcc
Code cleanups.
Most of the planned long tests done.
Ready to release as v1.1.0.
a-1an · Aug 20, 2015 · 4601bdf
1 parent b1478e8
Showing 63 changed files with 2,931 additions and 704 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,6 +1,8 @@
 *.py[cdo]
 *.swp
+*.so
 MANIFEST
+aeneas/build
 bak
 dist
 docs/build

diff --git a/README.md b/README.md
@@ -2,8 +2,8 @@
 
 **aeneas** is a Python library and a set of tools to automagically synchronize audio and text.
 
-* Version: 1.0.4
-* Date: 2015-08-09
+* Version: 1.1.0
+* Date: 2015-08-21
 * Developed by: [ReadBeyond](http://www.readbeyond.it/)
 * Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
 * License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -85,6 +85,7 @@ for example using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant)
 $ git clone https://github.com/readbeyond/aeneas.git
 $ cd aeneas
 $ pip install -r requirements.txt
+$ bash compile_c_extensions.sh 
 $ python check_dependencies.py
 ```
 
@@ -99,18 +100,24 @@ If you get an error, try running the
 $ sudo bash install_dependencies.sh
 ```
 
-and then try running `check_dependencies.py` again.
+and then try running `compile_c_extensions.sh` and `check_dependencies.py` again.
+
+If you are a Windows user, please read
+[these directions](https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ),
+written by Richard Margetts.
 
 Alternatively, consider using the [Vagrant box](http://www.vagrantup.com)
 created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
 
+
 ## Usage
 
 1. Clone this GitHub repo:
 
     ```bash
     $ git clone https://github.com/readbeyond/aeneas.git
     $ cd aeneas
+    $ bash compile_c_extensions.sh     
     ```
 
 2. To compute a SMIL synchronization map `map.smil` for a pair
@@ -168,6 +175,8 @@ Tutorial: [A Practical Introduction To The aeneas Package](http://www.albertopet
 
 Mailing list: [https://groups.google.com/d/forum/aeneas-forced-alignment](https://groups.google.com/d/forum/aeneas-forced-alignment)
 
+Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.readbeyond.it/aeneas/docs/changelog.html)
+
 
 ## Supported Features
 
@@ -181,20 +190,19 @@ Mailing list: [https://groups.google.com/d/forum/aeneas-forced-alignment](https:
 * Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
 * Code suitable for a Web app deployment (e.g., on-demand AWS instances)
 * Adjustable splitting times, including a max character/second constraint for CC applications
+* MFCC and DTW computed as Python C extensions to reduce the processing time
 
 
 ## Limitations and Missing Features 
 
 * Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
 * Audio is assumed to be spoken: not suitable/YMMV for song captioning
-* Offline (i.e., not real time/near real time) approach
 * DTW computation is memory hungry
 * No protection against memory trashing
 
 
 ## TODO List
 
-* Improving the speed of the code, especially when Sakoe-Chiba kicks in
 * Improving robustness against music in background
 * Isolate non-speech intervals (music, prolonged silence)
 * Automated text fragmentation based on audio analysis
@@ -203,7 +211,7 @@ Mailing list: [https://groups.google.com/d/forum/aeneas-forced-alignment](https:
 * Improving (removing?) dependency from `espeak`, `ffmpeg`, `ffprobe` executables
 * Multilevel sync map granularity (e.g., multilevel SMIL output)
 * Supporting input text encodings other than UTF-8
-* Adding (testing) more languages
+* Adding (i.e., testing) more languages
 * Better documentation
 * Testing other approaches, like HMM
 * Publishing the package on PyPI
@@ -336,6 +344,8 @@ and a Web application
 
 **May 2015**: release of this package on GitHub
 
+**August 2015**: release of v1.1.0, including Python C extensions
+to speed the computation of audio/text alignment up
 
 ## Acknowledgments
 
@@ -350,3 +360,4 @@ helped shaping the structure of this package
 for its asynchronous usage.
 
 
+
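
Reviewer note on the README.md hunks above: the new feature line says DTW is now computed in a C extension, while the limitations list still calls the DTW computation memory hungry. The sketch below is a textbook dynamic time warping cost in plain Python, included only to illustrate why both statements hold: the nested loops are slow in an interpreter, and the accumulated-cost table grows with the product of the two sequence lengths. It is not the code in `cdtw`. The function name `dtw_cost`, the scalar inputs and the absolute-difference distance are assumptions made for illustration (the package works on MFCC frames).

```python
def dtw_cost(a, b):
    """Return the accumulated DTW cost between two sequences of numbers.

    Illustrative only: scalar samples and an absolute-difference
    distance are assumptions, not what the aeneas extension uses.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # (n + 1) x (m + 1) table of accumulated costs: memory is O(n * m)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed predecessor cells
            acc[i][j] = local + min(acc[i - 1][j],      # advance in a only
                                    acc[i][j - 1],      # advance in b only
                                    acc[i - 1][j - 1])  # advance in both
    return acc[n][m]
```

For example, `dtw_cost([1, 2, 3], [1, 2, 2, 3])` evaluates to zero, because the repeated `2` is absorbed by the warping. Restricting the inner loop to a band around the diagonal (the Sakoe-Chiba constraint mentioned in the removed TODO item) reduces the number of cells visited.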
diff --git a/README.txt b/README.txt
@@ -4,8 +4,8 @@ aeneas
 **aeneas** is a Python library and a set of tools to automagically
 synchronize audio and text.
 
--  Version: 1.0.4
--  Date: 2015-08-09
+-  Version: 1.1.0
+-  Date: 2015-08-21
 -  Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 -  Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 -  License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -92,6 +92,7 @@ Installation
     $ git clone https://github.com/readbeyond/aeneas.git
     $ cd aeneas
     $ pip install -r requirements.txt
+    $ bash compile_c_extensions.sh 
     $ python check_dependencies.py
 
 If the last command prints a success message, you have all the required
@@ -105,7 +106,12 @@ If you get an error, try running the `provided
 
     $ sudo bash install_dependencies.sh
 
-and then try running ``check_dependencies.py`` again.
+and then try running ``compile_c_extensions.sh`` and
+``check_dependencies.py`` again.
+
+If you are a Windows user, please read `these
+directions <https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ>`__,
+written by Richard Margetts.
 
 Alternatively, consider using the `Vagrant
 box <http://www.vagrantup.com>`__ created by
@@ -120,6 +126,7 @@ Usage
 
        $ git clone https://github.com/readbeyond/aeneas.git
        $ cd aeneas
+       $ bash compile_c_extensions.sh     
 
 2. To compute a SMIL synchronization map ``map.smil`` for a pair
    (``audio.mp3``, ``text.txt``), you can run:
@@ -177,6 +184,8 @@ Package <http://www.albertopettarin.it/blog/2015/05/21/a-practical-introduction-
 
 Mailing list: https://groups.google.com/d/forum/aeneas-forced-alignment
 
+Changelog: http://www.readbeyond.it/aeneas/docs/changelog.html
+
 Supported Features
 ------------------
 
@@ -197,21 +206,21 @@ Supported Features
    instances)
 -  Adjustable splitting times, including a max character/second
    constraint for CC applications
+-  MFCC and DTW computed as Python C extensions to reduce the processing
+   time
 
 Limitations and Missing Features
 --------------------------------
 
 -  Audio should match the text: large portions of spurious text or audio
    might produce a wrong sync map
 -  Audio is assumed to be spoken: not suitable/YMMV for song captioning
--  Offline (i.e., not real time/near real time) approach
 -  DTW computation is memory hungry
 -  No protection against memory trashing
 
 TODO List
 ---------
 
--  Improving the speed of the code, especially when Sakoe-Chiba kicks in
 -  Improving robustness against music in background
 -  Isolate non-speech intervals (music, prolonged silence)
 -  Automated text fragmentation based on audio analysis
@@ -221,7 +230,7 @@ TODO List
    ``ffprobe`` executables
 -  Multilevel sync map granularity (e.g., multilevel SMIL output)
 -  Supporting input text encodings other than UTF-8
--  Adding (testing) more languages
+-  Adding (i.e., testing) more languages
 -  Better documentation
 -  Testing other approaches, like HMM
 -  Publishing the package on PyPI
@@ -345,6 +354,9 @@ application
 
 **May 2015**: release of this package on GitHub
 
+**August 2015**: release of v1.1.0, including Python C extensions to
+speed the computation of audio/text alignment up
+
 Acknowledgments
 ---------------
 

diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-1.0.4
+1.1.0
diff --git a/aeneas/__init__.py b/aeneas/__init__.py
@@ -23,7 +23,6 @@
 from aeneas.job import Job, JobConfiguration
 from aeneas.language import Language
 from aeneas.logger import Logger
-#from aeneas.mfcc
 from aeneas.syncmap import SyncMap, SyncMapFragment, SyncMapFormat
 from aeneas.synthesizer import Synthesizer
 from aeneas.task import Task, TaskConfiguration
@@ -34,10 +33,11 @@
 __author__ = "Alberto Pettarin"
 __copyright__ = """
     Copyright 2012-2013, Alberto Pettarin (www.albertopettarin.it)
-    Copyright 2013-2015, ReadBeyond Srl (www.readbeyond.it)
+    Copyright 2013-2015, ReadBeyond Srl   (www.readbeyond.it)
+    Copyright 2015,      Alberto Pettarin (www.albertopettarin.it)
     """
 __license__ = "GNU AGPL v3"
-__version__ = "1.0.4"
+__version__ = "1.1.0"
 __email__ = "[email protected]"
 __status__ = "Production"