We are building chat-analytics using TypeScript, React, amCharts 5 and webpack 5.
The project is split into 3 main parts:
pipeline/: the pipeline, which handles all the data from the input files to the final aggregated data. This is the core of the project; it is described in detail in PIPELINE.md.
app/: the main UI for generating reports, which is hosted at chatanalytics.app. It uses a WebWorker to run the pipeline.
report/: the report UI, which is exported as a single, static HTML file that acts as a placeholder. It is then filled with the processed data from the pipeline to display the graphs and stats. It also uses its own WebWorker to aggregate the data.
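The "static HTML placeholder filled with processed data" idea can be sketched as follows. This is a minimal illustration, not the project's actual export code: the `DATA_PLACEHOLDER` token, the `ProcessedData` shape and the `fillReport` function are all hypothetical names chosen for the example.

```typescript
// Hypothetical placeholder token; the real report template may use a different marker.
const DATA_PLACEHOLDER = "[[[DATA]]]";

// Hypothetical shape of the processed data; the real pipeline output is richer.
interface ProcessedData {
    title: string;
    totalMessages: number;
}

/** Returns a copy of the static report HTML with the placeholder replaced by the serialized data. */
function fillReport(templateHtml: string, data: ProcessedData): string {
    if (templateHtml.indexOf(DATA_PLACEHOLDER) === -1) {
        throw new Error("report template has no data placeholder");
    }
    // Escape "<" so the payload cannot close the surrounding <script> tag early.
    const payload = JSON.stringify(data).replace(/</g, "\\u003c");
    // Use a replacer function so "$" sequences in the payload are not interpreted.
    return templateHtml.replace(DATA_PLACEHOLDER, () => payload);
}

// Usage: a tiny stand-in for the exported report template.
const template =
    `<html><body><script id="data" type="application/json">${DATA_PLACEHOLDER}</script></body></html>`;
const html = fillReport(template, { title: "Demo </script>", totalMessages: 3 });
```

Keeping the template and the data separate until the last step means the report UI can be built once and reused for every generated report.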
A small but relevant part is:
lib/: contains the entry point for the chat-analytics package, as well as the CLI.
See DEV.md for development instructions and guidelines.
The demo is an export from the DefleMask server, which is owned by a friend of mine who allowed me to use it as a demo. The input files are stored as a zip in Google Drive and downloaded during CI to build the demo HTML automatically using the CLI (with --demo). I update the export manually with the following command (we may want to move to a periodic workflow eventually):
docker run --rm -it -v $PWD/out:/out tyrrrz/discordchatexporter:stable exportguild -f json -g 253601524398293010 -t <token>
The output is then zipped and uploaded to Google Drive, replacing the existing file (~280MB uncompressed, 23MB compressed).
We aim to handle millions of messages in a single report (at least 10M). This is a lot of data, so we have to be careful with the amount of memory used when generating the report (browser tabs tend to crash when they request too much memory), and we have to make sure reports don't become impractically large.
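To give a feel for the memory budget, here is a rough sketch of a fixed-width, typed-array message store. It is only an illustration under assumed names and fields (`MessageTable`, `CompactMessage`, three 32-bit fields per message) and does not describe how chat-analytics actually stores messages. With this layout each message takes 12 bytes, so 10M messages need about 120 MB, with no per-message object overhead.

```typescript
// Hypothetical fixed-width record: 3 unsigned 32-bit fields per message.
const FIELDS_PER_MESSAGE = 3;

// Hypothetical compact message; text and attachments are deliberately left out.
interface CompactMessage {
    authorIndex: number;
    channelIndex: number;
    secondsSinceEpoch: number;
}

class MessageTable {
    private readonly data: Uint32Array;
    private count = 0;

    constructor(capacity: number) {
        // One contiguous allocation instead of one object per message.
        this.data = new Uint32Array(capacity * FIELDS_PER_MESSAGE);
    }

    push(msg: CompactMessage): void {
        const base = this.count * FIELDS_PER_MESSAGE;
        if (base >= this.data.length) throw new Error("table is full");
        this.data[base] = msg.authorIndex;
        this.data[base + 1] = msg.channelIndex;
        this.data[base + 2] = msg.secondsSinceEpoch;
        this.count++;
    }

    get(index: number): CompactMessage {
        if (index < 0 || index >= this.count) throw new RangeError("index out of bounds");
        const base = index * FIELDS_PER_MESSAGE;
        return {
            authorIndex: this.data[base],
            channelIndex: this.data[base + 1],
            secondsSinceEpoch: this.data[base + 2],
        };
    }

    size(): number {
        return this.count;
    }

    byteLength(): number {
        return this.data.byteLength;
    }
}

// Usage: a small table; scale the capacity up to estimate larger reports.
const table = new MessageTable(1000);
table.push({ authorIndex: 7, channelIndex: 2, secondsSinceEpoch: 1700000000 });
const first = table.get(0);
```

A flat buffer like this can also be transferred between a WebWorker and the main thread without copying, which matters when the data is this large.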